What is the fastest way to search for a pattern in a data frame with a few million entries?

Sun, Apr 10, 2016 12:27 PM

Hi Duncan,

Sorry, I got an email saying that my message was awaiting approval, and 
when I looked at the forum I didn't see it. That is why I sent it again, 
and this time I checked that the format of my message was text only. 
Sorry for the noise.

My strings are 1-grams up to 5-grams (sequences of 1 to 5 words), and I 
am searching my DF for the frequency of the strings that start with a 
given sequence of a few words.
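
To make the lookup concrete, here is a minimal sketch, written in Python with the standard library only rather than against the actual data frame. The toy lists `ngrams` and `freqs` and both function names are made up for illustration. It contrasts a plain scan over every row with a search over sorted keys, which is one possible way to speed up prefix queries and not necessarily the best one.

```python
from bisect import bisect_left

# Hypothetical toy data standing in for the data frame:
# one n-gram string and its frequency per row.
ngrams = ["i am", "i am here", "i am not", "i was", "you are"]
freqs = [10, 3, 4, 7, 5]


def prefix_scan(prefix):
    """Baseline: test every row, so each query costs O(n)."""
    return [(g, f) for g, f in zip(ngrams, freqs) if g.startswith(prefix)]


# Sort once so that all strings sharing a prefix sit next to each other.
rows = sorted(zip(ngrams, freqs))
keys = [g for g, _ in rows]


def prefix_sorted(prefix):
    """Binary search for the contiguous block of rows starting with `prefix`.

    Assumes no n-gram contains the sentinel character U+10FFFF.
    Note that this is a character prefix: "i am" would also match
    "i amble"; append a space to the query to match whole words only.
    """
    lo = bisect_left(keys, prefix)
    hi = bisect_left(keys, prefix + "\U0010ffff")
    return rows[lo:hi]


if __name__ == "__main__":
    print(prefix_sorted("i am"))
```

The sort is paid once up front; each query afterwards needs only two binary searches plus the size of the matching block.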

I guess these days it is standard to use DFs with millions of entries, 
so I was wondering how people do this in the fastest way.

Thanks
Cheers
Fabien

Dr Fabien Tarrade

Quantitative Analyst/Developer - Data Scientist

Senior data analyst specialised in the modelling, processing and 
statistical treatment of data.
PhD in Physics, 10 years of experience as researcher at the forefront of 
international scientific research.
Fascinated by finance and data modelling.

Geneva, Switzerland

Email : contact at fabien-tarrade.eu
Website : www.fabien-tarrade.eu
Phone : +33 (0)6 14 78 70 90

LinkedIn <http://ch.linkedin.com/in/fabientarrade/>
Twitter <https://twitter.com/fabtar>
Google <https://plus.google.com/+FabienTarradeProfile/posts>
Facebook <https://www.facebook.com/fabien.tarrade.eu>
Skype <skype:fabtarhiggs?call>
Xing <https://www.xing.com/profile/Fabien_Tarrade>
