Message-ID: <CAM_vjukZefuVqUa9xWKU5qK7x215AXJW18Aws=0iO6AJ5hw9Bg@mail.gmail.com>
Date: 2012-05-22T12:53:22Z
From: Sarah Goslee
Subject: Problem with Extracting Hash Tagged Words from Tweets
In-Reply-To: <1337684420.86378.YahooMailNeo@web65501.mail.ac4.yahoo.com>
Hi,
A small reproducible bit of your data would have been nice, and I have
no idea what "manually remove all regular expressions" might mean, but
take a look at this:
library(stringr)
x <- list("marymaryw: Get an insight into how journalists operate at
The News by following #dayatthenews today #pompeyhacks #portsmouth
#southsea", "VouchAR_Ports: ?5 instead of ?60 for 1 month of unlimited
fitness classes at Outdoor Fitness Leeds - get bikini...
http://t.co/BUrkjtCh #Portsmouth", "BillieRaePhoto: RT @vintagesecret:
My dad has just sent me this picture. Looks like @GunwharfQuays is on
fire?! #portsmouth http://t.co/HbAV7Hw0")
> sapply(x, str_extract_all, "#\\<.*?\\>")
[[1]]
[1] "#dayatthenews" "#pompeyhacks" "#portsmouth" "#southsea"
[[2]]
[1] "#Portsmouth"
[[3]]
[1] "#portsmouth"
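For reference, the same per-tweet extraction can be sketched in base R, without stringr. This is only a minimal sketch under my own assumptions: the names `tweets`, `hits`, `tags_by_tweet` and `all_tags` and the pattern `#[[:alnum:]_]+` are chosen for illustration, and the three strings are copied from the tweets quoted in this thread. The pattern avoids `\\<` and `\\>`, whose support depends on the regex engine in use.

```r
# Three tweets from this thread, stored as a character vector (one element each).
tweets <- c(
  "marymaryw: Get an insight into how journalists operate at The News by following #dayatthenews today #pompeyhacks #portsmouth #southsea",
  "TelArnott: Looking forward to #K1 today! #gym01 #portsmouth",
  "i_amnik: RT @BBCRadioSolent: Can you see the #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961. #Portsmouth."
)

# gregexpr() locates every match in each element of the vector.
# Pattern: a hash sign followed by one or more letters, digits or underscores.
hits <- gregexpr("#[[:alnum:]_]+", tweets)

# regmatches() pulls out the matched text: a list with one character
# vector of hashtags per tweet.
tags_by_tweet <- regmatches(tweets, hits)

# Flatten to a single vector when per-tweet grouping is not needed.
all_tags <- unlist(tags_by_tweet)
print(all_tags)
```

Because gregexpr() works on the whole vector at once, nothing has to be edited by hand to strip the printed [[1]] and [1] indices. If the tweets are held in a list, as in x above, unlist(x) should turn them into a character vector first.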
Sarah
On Tue, May 22, 2012 at 7:00 AM, Adedoyin-Olowe Mariam
<mariamolowe2008 at yahoo.com> wrote:
> Hello All,
> Can anyone help me solve this problem?
> I am trying to extract hash-tagged words from tweets downloaded with twitteR.
>
> I can extract hash-tagged words from a single tweet using (stringr) str_extract_all(tweets, "#[a-z//A-Z//0-9]+")
> but cannot do so with more than one tweet at a time unless I manually remove all regular expressions and tweet numbers such as [[1]] and [1].
>
> I want to automatically extract all #words from a large number of tweets in one go.
> This is what I have done so far by removing all regular expressions manually:
>
>> searchTwitter("#Portsmouth", n=20)
> [[1]]
> [1] "marymaryw: Get an insight into how journalists operate at The News by following #dayatthenews today #pompeyhacks #portsmouth #southsea"
> [[2]]
> [1] "VouchAR_Ports: ?5 instead of ?60 for 1 month of unlimited fitness classes at Outdoor Fitness Leeds - get bikini... http://t.co/BUrkjtCh #Portsmouth"
> [[3]]
> [1] "BillieRaePhoto: RT @vintagesecret: My dad has just sent me this picture. Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0"
> [[4]]
> [1] "xangma: RT @vintagesecret: My dad has just sent me this picture. Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0"
> [[5]]
> [1] "vintagesecret: My dad has just sent me this picture. Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0"
> [[6]]
> [1] "i_amnik: RT @BBCRadioSolent: Can you see the #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961. #Portsmouth."
> [[7]]
> [1] "vickiredmond: RT @dan_germain: RT @MatMacAulay: Best pic of #Gunwharf on fire I have seen http://t.co/8LNAiqiD #portsmouth"
> [[8]]
> [1] "EmilieRosa: Highs of 25 degrees on the island this week!! Beach time after exams I think! ;) #Portsmouth"
> [[9]]
> [1] "MrYiff: RT @dan_germain: RT @MatMacAulay: Best pic of #Gunwharf on fire I have seen http://t.co/8LNAiqiD #portsmouth"
> [[10]]
> [1] "otbsaad: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM"
> [[11]]
> [1] "PN_Newsdesk: #Portsmouth: Ferryspeed looks to build on its past successes http://t.co/CmDglDkg"
> [[12]]
> [1] "PN_Newsdesk: #Portsmouth: More room for stalls at top Southsea school - A SOUTHSEA primary school still has room for people to se... http://t.co/ucbYWjPR"
> [[13]]
> [1] "VouchAR_Ports: ?14 instead of ?30 for a pedicure with foiled transfer at Forever Young, Stoke-on-Trent - get... http://t.co/P7gJBcl8 #Portsmouth"
> [[14]]
> [1] "TelArnott: Looking forward to #K1 today! #gym01 #portsmouth"
> [[15]]
> [1] "dan_germain: RT @MatMacAulay: Best pic of #Gunwharf on fire I have seen http://t.co/8LNAiqiD #portsmouth"
> [[16]]
> [1] "dan_germain: RT @portsmouthnews: News: Large fire at Gunwharf Quays - http://t.co/s9RWpY0i #portsmouth #southsea"
> [[17]]
> [1] "i_amnik: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM"
> [[18]]
> [1] "solentmotorcars: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM"
> [[19]]
> [1] "HantsChiefAlex: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM"
> [[20]]
> [1] "BBCRadioSolent: Can you see the #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961. #Portsmouth."
>> tweets <-c("marymaryw: Get an insight into how journalists operate at The News by following #dayatthenews today #pompeyhacks #portsmouth #southsea VouchAR_Ports ?5 instead of ?60 for 1 month of unlimited fitness classes at Outdoor Fitness Leeds - get bikini... http://t.co/BUrkjtCh #Portsmouth BillieRaePhoto RT @vintagesecret My dad has just sent me this picture. Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0 xangma: RT @vintagesecret My dad has just sent me this picture. Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0 vintagesecret My dad has just sent me this picture. Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0iamnik: RT @BBCRadioSolent Can you see the #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961. #Portsmouth. vickiredmond @MatMacAulay Best pic of#Gunwharf on fire I have seen http://t.co/8LNAiqiD #portsmouth EmilieRosa: Highs of 25 degrees on the island
> this week!! Beach time after exams I think!) #Portsmouth mYiff RT @dan_germain: RT @MatMacAulay Best pic of #Gunwharf on fire I have seen http://t.co/8LNAiqiD #portsmouth otbsaad RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM PN_Newsdesk #Portsmouth: Ferryspeed looks to build on its past successes http://t.co/CmDglDkg PN_Newsdesk #Portsmouth More room for stalls at top Southsea school - A SOUTHSEA primary school still has room for people to se... http://t.co/ucbYWjPR VouchAR_Ports ?14 instead of ?30 for a pedicure with foiled transfer at Forever Young, Stoke-on-Trent - get... http://t.co/P7gJBcl8 #Portsmouth TelArnott Looking forward to #K1 today! #gym01 #portsmouth Best pic of #Gunwharf on fire I have seen http://t.co/8LNAiqiD #portsmouth dangermain RT @portsmouthnews News Large fire at Gunwharf Quays - http://t.co/s9RWpY0i #portsmouth #southsea iamnik RT
> @BBCRadioSolent BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM solentmotorcars RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM HantsChiefAlex RT @BBCRadioSolent BREAKING NEWS - Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM BBCRadioSolent Can you see the #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961. #Portsmouth")
>> str_extract_all(tweets, "#[a-z//A-Z//0-9]+")
> [[1]]
>  [1] "#dayatthenews"  "#pompeyhacks"   "#portsmouth"    "#southsea"      "#Portsmouth"    "#portsmouth"    "#portsmouth"
>  [8] "#portsmouth"    "#GunwharfQuays" "#Portsmouth"    "#Gunwharf"      "#portsmouth"    "#Portsmouth"    "#Gunwharf"
> [15] "#portsmouth"    "#GunwharfQuays" "#Portsmouth"    "#Portsmouth"    "#Portsmouth"    "#Portsmouth"    "#K1"
> [22] "#gym01"         "#portsmouth"    "#Gunwharf"      "#portsmouth"    "#portsmouth"    "#southsea"      "#GunwharfQuays"
> [29] "#Portsmouth"    "#GunwharfQuays" "#Portsmouth"    "#GunwharfQuays" "#Portsmouth"    "#GunwharfQuays" "#Portsmouth"
>
> Please I need help.
>
> Mariam
--
Sarah Goslee
http://www.functionaldiversity.org