Message-ID: <570ADACC.6010306@gmail.com>
Date: 2016-04-10T22:59:24Z
From: Fabien Tarrade
Subject: what is the faster way to search for a pattern in a few million entries data frame ?
In-Reply-To: <CA+8X3fUnRS7UThLHN9YCASRBDatnZ-qoH+Sh1A3WcrWK_rzcZQ@mail.gmail.com>

Hi Jim,

I didn't know this one. I will have a look.

Thanks
Cheers
Fabien
> Hi Fabien,
> I was going to send this last night, but I thought it was too simple.
> Runs in about one millisecond.
>
> df<-data.frame(freq=runif(1000),
>   strings=apply(matrix(sample(LETTERS,10000,TRUE),ncol=10),
>   1,paste,collapse=""))
> match.ind<-grep("DF",df$strings)
> match.ind
>   [1]   2  11  91 133 169 444 547 605 734 943
>
> Jim
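
A minimal sketch of how the quoted grep() approach might be tried on a larger table, assuming the pattern is a literal string so that fixed = TRUE applies. The row count, the seed and the names idx_regex, idx_fixed and hits are illustrative and do not come from the thread.

```r
# Illustrative data: same construction as the quoted example, but larger.
# n and the seed are assumptions; raise n toward a few million to match
# the size asked about in the subject line.
set.seed(42)
n <- 1e5
strings <- apply(matrix(sample(LETTERS, n * 10, TRUE), ncol = 10),
                 1, paste, collapse = "")
df <- data.frame(freq = runif(n), strings = strings,
                 stringsAsFactors = FALSE)

# Regular-expression matching, as in the quoted example.
idx_regex <- grep("DF", df$strings)

# Literal matching: fixed = TRUE treats the pattern as a plain string
# instead of a regular expression.
idx_fixed <- grep("DF", df$strings, fixed = TRUE)

# For a pattern with no regex metacharacters both calls select the same rows.
stopifnot(identical(idx_regex, idx_fixed))

# Keep only the matching rows.
hits <- df[idx_fixed, ]

# Time each variant on your own data before choosing one.
print(system.time(grep("DF", df$strings)))
print(system.time(grep("DF", df$strings, fixed = TRUE)))
```

Timings depend on the machine, the pattern and the string lengths, so it is worth measuring on the real data frame rather than relying on the 1000-row figure above.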

-- 
Dr Fabien Tarrade

Quantitative Analyst/Developer - Data Scientist

Senior data analyst specialised in the modelling, processing and 
statistical treatment of data.
PhD in Physics, 10 years of experience as researcher at the forefront of 
international scientific research.
Fascinated by finance and data modelling.

Geneva, Switzerland

Email : contact at fabien-tarrade.eu <mailto:contact at fabien-tarrade.eu>
Website : www.fabien-tarrade.eu <http://www.fabien-tarrade.eu>
Phone : +33 (0)6 14 78 70 90

LinkedIn <http://ch.linkedin.com/in/fabientarrade/>
Twitter <https://twitter.com/fabtar>
Google <https://plus.google.com/+FabienTarradeProfile/posts>
Facebook <https://www.facebook.com/fabien.tarrade.eu>
Skype <skype:fabtarhiggs?call>
Xing <https://www.xing.com/profile/Fabien_Tarrade>