r - How to do multi-key lookups using data.table?
I am using data.table to do some repeated lookups on a large dataset (45M rows, 4 integer columns).
This is what I want to do.
    library(data.table)
    # generate some data; a given u can show up under several s's
    d1 <- data.table(u = rep(1:500, 2), s = round(runif(1000, 1, 100), 0))
    setkey(d1, u, s)
    # for a given u, I want to look up all of its s's
    us <- d1[J(u = 1), "s", with = FALSE]
    # for each s in the above data.table,
    # I want to look up the other u's from the original data.table d1
    # DOESN'T WORK:
    otheru <- d1[J(s = us), "u", with = FALSE]
    # THIS WORKS, but takes a long time on my big dataset:
    otheru <- merge(d1, us, by = 's')

merge works for my purpose, but since my 'd1' >>> 'us', it takes a long time. At first I thought I might be using merge from base, but based on the docs it looks like the data.table merge is dispatched when the first argument to merge is a data.table.
You can change the key for that purpose. With s as the leading key column, the s values are grouped together:
            u   s
       1:  20   1
       2:  35   1
       3:  36   1
       4:  87   1
       5: 123   1
      ---
     996: 208 100
     997: 262 100
     998: 352 100
     999: 430 100
    1000: 455 100

Operations carried out on groups defined by the key columns generally run very fast, such as d1[, mean(u), keyby = 's'].
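As a small sketch of the keyed aggregation mentioned above: this reuses the simulated d1 from the question, and the fixed seed and the column name mean_u are my own additions for illustration, not part of the original post.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
d1 <- data.table(u = rep(1:500, 2), s = round(runif(1000, 1, 100), 0))

# Key on s first so that rows sharing an s value are stored together
setkey(d1, s, u)

# One row per distinct s; keyby also leaves the result keyed (sorted) by s
by_s <- d1[, .(mean_u = mean(u)), keyby = s]
head(by_s)
```

The result has one row per distinct value of s, so on the real 45M-row table it should be far smaller than d1 itself.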
If you want to speed up aggregation for both groupings, u and s, you can keep two data.tables, one with setkey(d1, u, s) and the other with setkey(d1, s, u). If you want to quickly subset on the groups defined by the values of u, use the former data.table; otherwise use the latter.
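A minimal sketch of that two-table approach, applied to the lookup from the question. The names d_us, d_su and otheru_on are hypothetical, and the fixed seed is my addition. Since setkey() reorders a table by reference, the sketch takes an explicit copy() for each key order. The last statement shows an alternative that I believe needs data.table 1.9.6 or later: an ad hoc join with on=, which avoids the second copy.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed so the sketch is reproducible
d1 <- data.table(u = rep(1:500, 2), s = round(runif(1000, 1, 100), 0))

# Two copies of the same data, each keyed for one direction of lookup
d_us <- copy(d1); setkey(d_us, u, s)  # fast lookups by u
d_su <- copy(d1); setkey(d_su, s, u)  # fast lookups by s

# Step 1: all distinct s values for u == 1 (join on the leading key column u)
us <- unique(d_us[J(1L), s])

# Step 2: all rows whose s is in that set (join on the leading key column s)
otheru <- d_su[J(us), nomatch = 0L]

# Alternative without a second keyed copy (assumes data.table >= 1.9.6)
otheru_on <- d1[.(s = us), on = "s", nomatch = 0L]
```

The trade-off is memory: the second copy doubles the footprint of the table, which matters at 45M rows, while the on= form keeps a single table.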