random cdisc data very slow for larger data #21

@cicdguy

Description

Original message

Running the following code takes a long time. This is on r.roche.com with R 3.6.3.

NEST/nest_on_bee/master/bee_nest_utils.R")
bee_use_nest(release = "2021_05_05")
ADSL <- radsl(N = 1002)
ADLB <- radlb(ADSL)

I reduced N from 15000 because it took far too long. Using system.time, I get the following results:

   user  system elapsed
 37.852   0.584  38.436

That is an extremely long time to create a dataset with 21,000 records! I know random.cdisc really only exists for dummy data, but this seems like extremely poor performance.
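To check how the run time scales with N rather than relying on a single measurement, the timing can be repeated at several sizes. The following is a minimal sketch in base R; `make_rows` is a hypothetical stand-in generator (not part of random.cdisc.data) so that the snippet runs on its own, and the sizes are arbitrary.

```r
# Hypothetical stand-in for the real generator; to time the package itself,
# swap in the call from this report, e.g. radlb(radsl(N = n)).
make_rows <- function(n) {
  data.frame(id = rep(seq_len(n), each = 21L), aval = rnorm(n * 21L))
}

sizes <- c(100L, 1000L, 10000L)

# system.time() returns a proc_time object; keep only the elapsed seconds.
elapsed <- vapply(
  sizes,
  function(n) system.time(make_rows(n))[["elapsed"]],
  numeric(1)
)

# If per_subject grows with N, the generator scales worse than linearly.
timings <- data.frame(N = sizes, elapsed = elapsed, per_subject = elapsed / sizes)
print(timings)
```

If the elapsed time per subject rises as N grows, that would point to something like repeated row-binding inside a loop rather than to the random draws themselves.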

Provenance:

Creator: martik32

TODO

Improve performance. A few suggestions:

  1. Use mclapply (from the parallel package).
  2. Use data.table if necessary.
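The first suggestion could look roughly like the sketch below. It is an illustration under assumptions, not the package's actual implementation: `simulate_subject` is a hypothetical per-subject generator, and the parameter codes, visit count, and core count are made up for the example.

```r
library(parallel)

# Hypothetical stand-in for per-subject record generation; NOT radlb() itself.
simulate_subject <- function(id, n_visits = 7L, params = c("ALT", "CRP", "IGA")) {
  grid <- expand.grid(
    AVISITN = seq_len(n_visits) - 1L,
    PARAMCD = params,
    stringsAsFactors = FALSE
  )
  grid$USUBJID <- id
  grid$AVAL <- rnorm(nrow(grid), mean = 50, sd = 8)
  grid
}

ids <- sprintf("SUBJ-%04d", 1:100)

# mclapply() forks the R process, so more than one core is only supported on
# Unix-alikes; on Windows mc.cores must be 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# L'Ecuyer-CMRG gives each forked worker its own random number stream.
RNGkind("L'Ecuyer-CMRG")
set.seed(1)
chunks <- mclapply(ids, simulate_subject, mc.cores = n_cores)

# Bind once at the end instead of growing a data frame inside a loop.
adlb_like <- do.call(rbind, chunks)
```

If data.table is available, `data.table::rbindlist(chunks)` is a possible drop-in for the final `do.call(rbind, chunks)` step, which tends to be the slow part when there are many small data frames.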

Labels: bug (Something isn't working), sme
