Problem with Extracting Hash Tagged Words from Tweets
Hi,
On Tue, May 22, 2012 at 10:55 AM, Adedoyin-Olowe Mariam
<mariamolowe2008 at yahoo.com> wrote:
Hi Sarah, Thanks for your help. I'm sorry my question was not clear enough. Maybe what I should ask is how to remove the tweet numbers from the downloaded list x (i.e. [[1]], [1], [[2]], [2], ...) before running sapply(x, str_extract_all, "#\\<.*?\\>").
Those aren't part of the tweets. Those are the numbers R uses when displaying portions of a list.
The presence of these numbers in square brackets is causing an error to be reported.
What error? You'll need to give us an actual reproducible example, since what you are describing is unclear. Although I suppose it's possible that you simply want:
unlist(sapply(x, str_extract_all, "#\\<.*?\\>"))
[1] "#dayatthenews" "#pompeyhacks" "#portsmouth" "#southsea"
[5] "#Portsmouth" "#portsmouth"
It's impossible for me to tell precisely what the problem is.
Sarah
Thanks. Mariam
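To illustrate the point made in this reply, that [[1]] and [1] are display indices and not part of the tweets, here is a minimal base-R sketch. The two strings are hypothetical stand-ins for downloaded tweets, not the original data.

```r
# Hypothetical stand-in for the downloaded tweets: a plain list of strings.
x <- list("user_a: first tweet #one #two",
          "user_b: second tweet #three")

# The index markers appear only in the printed representation of the list.
printed <- capture.output(print(x))
has_marker_in_print <- any(grepl("[[1]]", printed, fixed = TRUE))

# The stored strings themselves contain no such markers.
has_marker_in_data <- any(grepl("[[1]]", unlist(x), fixed = TRUE))

# unlist() flattens the list into one character vector,
# which prints with a single running [1] index instead of [[i]] headers.
flat <- unlist(x)
```

Since the markers exist only in the printout, there is nothing to strip from the data before applying a pattern to it.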
________________________________
From: Sarah Goslee <sarah.goslee at gmail.com>
To: Adedoyin-Olowe Mariam <mariamolowe2008 at yahoo.com>
Cc: "r-help at r-project.org" <r-help at r-project.org>
Sent: Tuesday, 22 May 2012, 13:53
Subject: Re: [R] Problem with Extracting Hash Tagged Words from Tweets
Hi,
A small reproducible bit of your data would have been nice, and I have
no idea what "manually remove all regular expressions" might mean, but
take a look at this:
library(stringr)  # provides str_extract_all()
x <- list("marymaryw: Get an insight into how journalists operate at
The News by following #dayatthenews today #pompeyhacks #portsmouth
#southsea", "VouchAR_Ports: ?5 instead of ?60 for 1 month of unlimited
fitness classes at Outdoor Fitness Leeds - get bikini...
http://t.co/BUrkjtCh #Portsmouth", "BillieRaePhoto: RT @vintagesecret:
My dad has just sent me this picture. Looks like @GunwharfQuays is on
fire?! #portsmouth http://t.co/HbAV7Hw0")
sapply(x, str_extract_all, "#\\<.*?\\>")
[[1]]
[1] "#dayatthenews" "#pompeyhacks" "#portsmouth" "#southsea"
[[2]]
[1] "#Portsmouth"
[[3]]
[1] "#portsmouth"
Sarah
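For readers without stringr, the same per-tweet extraction can probably be sketched in base R with gregexpr() and regmatches(). The tweets below are hypothetical, and the pattern uses a POSIX character class instead of the "#\\<.*?\\>" word-boundary form above, whose behaviour depends on the regex engine in use.

```r
# Hypothetical sample tweets (not the downloaded data).
x <- list("user_a: news today #dayatthenews #portsmouth",
          "user_b: no tags here",
          "user_c: fire at the quays #Portsmouth")

# '#' followed by one or more letters, digits or underscores.
pattern <- "#[[:alnum:]_]+"

# gregexpr() locates every match in each element; regmatches() extracts them.
tweets <- unlist(x)
per_tweet <- regmatches(tweets, gregexpr(pattern, tweets))

# One flat vector of all hashtags, in order of appearance.
all_tags <- unlist(per_tweet)
```

Here per_tweet keeps one element per tweet (empty where a tweet has no hashtags), while all_tags is the flattened result, analogous to wrapping the sapply() call in unlist().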
On Tue, May 22, 2012 at 7:00 AM, Adedoyin-Olowe Mariam
<mariamolowe2008 at yahoo.com> wrote:
Hello All,
Can anyone help me solve this problem?
I am trying to extract hash-tagged words from tweets downloaded with
twitteR.
I can extract hash-tagged words from a single tweet using
(stringr) str_extract_all(tweets, "#[a-z//A-Z//0-9]+")
but not from more than one tweet at a time unless I manually remove all
regular expressions and tweet numbers such as [[1]] and [1].
I want to automatically extract all #words from a large number of tweets in
one go.
This is what I have done so far by removing all regular expressions
manually:
searchTwitter("#Portsmouth", n=20)
[[1]]
[1] "marymaryw: Get an insight into how journalists operate at The News by
following #dayatthenews today #pompeyhacks #portsmouth #southsea"
[[2]]
[1] "VouchAR_Ports: ?5 instead of ?60 for 1 month of unlimited fitness
classes at Outdoor Fitness Leeds - get bikini... http://t.co/BUrkjtCh
#Portsmouth"
[[3]]
[1] "BillieRaePhoto: RT @vintagesecret: My dad has just sent me this
picture. Looks like @GunwharfQuays is on fire?! #portsmouth
http://t.co/HbAV7Hw0"
[[4]]
[1] "xangma: RT @vintagesecret: My dad has just sent me this picture.
Looks like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0"
[[5]]
[1] "vintagesecret: My dad has just sent me this picture. Looks like
@GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0"
[[6]]
[1] "i_amnik: RT @BBCRadioSolent: Can you see the #GunwharfQuays fire?
Eye-witnesses please call - 0845 30 30 961. #Portsmouth."
[[7]]
[1] "vickiredmond: RT @dan_germain: RT @MatMacAulay: Best pic of #Gunwharf
on fire I have seen http://t.co/8LNAiqiD #portsmouth"
[[8]]
[1] "EmilieRosa: Highs of 25 degrees on the island this week!! Beach time
after exams I think! ;) #Portsmouth"
[[9]]
[1] "MrYiff: RT @dan_germain: RT @MatMacAulay: Best pic of #Gunwharf on
fire I have seen http://t.co/8LNAiqiD #portsmouth"
[[10]]
[1] "otbsaad: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire
at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM"
[[11]]
[1] "PN_Newsdesk: #Portsmouth: Ferryspeed looks to build on its past
successes http://t.co/CmDglDkg"
[[12]]
[1] "PN_Newsdesk: #Portsmouth: More room for stalls at top Southsea school
- A SOUTHSEA primary school still has room for people to se...
http://t.co/ucbYWjPR"
[[13]]
[1] "VouchAR_Ports: ?14 instead of ?30 for a pedicure with foiled transfer
at Forever Young, Stoke-on-Trent - get... http://t.co/P7gJBcl8 #Portsmouth"
[[14]]
[1] "TelArnott: Looking forward to #K1 today! #gym01 #portsmouth"
[[15]]
[1] "dan_germain: RT @MatMacAulay: Best pic of #Gunwharf on fire I have
seen http://t.co/8LNAiqiD #portsmouth"
[[16]]
[1] "dan_germain: RT @portsmouthnews: News: Large fire at Gunwharf Quays -
http://t.co/s9RWpY0i #portsmouth #southsea"
[[17]]
[1] "i_amnik: RT @BBCRadioSolent: BREAKING NEWS - Reports of a large fire
at #GunwharfQuays in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM"
[[18]]
[1] "solentmotorcars: RT @BBCRadioSolent: BREAKING NEWS - Reports of a
large fire at #GunwharfQuays in #Portsmouth. Latest updates on
@BBCRadioSolent 96.1FM"
[[19]]
[1] "HantsChiefAlex: RT @BBCRadioSolent: BREAKING NEWS - Reports of a
large fire at #GunwharfQuays in #Portsmouth. Latest updates on
@BBCRadioSolent 96.1FM"
[[20]]
[1] "BBCRadioSolent: Can you see the #GunwharfQuays fire? Eye-witnesses
please call - 0845 30 30 961. #Portsmouth."
tweets <-c("marymaryw: Get an insight into how journalists operate at The
News by following #dayatthenews today #pompeyhacks #portsmouth #southsea
VouchAR_Ports ?5 instead of ?60 for 1 month of unlimited fitness classes at
Outdoor Fitness Leeds - get bikini... http://t.co/BUrkjtCh #Portsmouth
BillieRaePhoto RT @vintagesecret My dad has just sent me this picture. Looks
like @GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0 xangma: RT
@vintagesecret My dad has just sent me this picture. Looks like
@GunwharfQuays is on fire?! #portsmouth http://t.co/HbAV7Hw0 vintagesecret
My dad has just sent me this picture. Looks like @GunwharfQuays is on fire?!
#portsmouth http://t.co/HbAV7Hw0iamnik: RT @BBCRadioSolent Can you see the
#GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961.
#Portsmouth. vickiredmond @MatMacAulay Best pic of#Gunwharf on fire I have
seen http://t.co/8LNAiqiD #portsmouth EmilieRosa: Highs of 25 degrees on the
island
this week!! Beach time after exams I think!) #Portsmouth mYiff RT
@dan_germain: RT @MatMacAulay Best pic of #Gunwharf on fire I have seen
http://t.co/8LNAiqiD #portsmouth otbsaad RT @BBCRadioSolent: BREAKING NEWS -
Reports of a large fire at #GunwharfQuays in #Portsmouth. Latest updates on
@BBCRadioSolent 96.1FM PN_Newsdesk #Portsmouth: Ferryspeed looks to build on
its past successes http://t.co/CmDglDkg PN_Newsdesk #Portsmouth More room
for stalls at top Southsea school - A SOUTHSEA primary school still has room
for people to se... http://t.co/ucbYWjPR VouchAR_Ports ?14 instead of ?30
for a pedicure with foiled transfer at Forever Young, Stoke-on-Trent -
get... http://t.co/P7gJBcl8 #Portsmouth TelArnott Looking forward to #K1
today! #gym01 #portsmouth Best pic of #Gunwharf on fire I have seen
http://t.co/8LNAiqiD #portsmouth dangermain RT @portsmouthnews News Large
fire at Gunwharf Quays - http://t.co/s9RWpY0i #portsmouth #southsea iamnik
RT
@BBCRadioSolent BREAKING NEWS - Reports of a large fire at #GunwharfQuays
in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM solentmotorcars RT
@BBCRadioSolent: BREAKING NEWS - Reports of a large fire at #GunwharfQuays
in #Portsmouth. Latest updates on @BBCRadioSolent 96.1FM HantsChiefAlex RT
@BBCRadioSolent BREAKING NEWS - Reports of a large fire at #GunwharfQuays in
#Portsmouth. Latest updates on @BBCRadioSolent 96.1FM BBCRadioSolent Can you
see the #GunwharfQuays fire? Eye-witnesses please call - 0845 30 30 961.
#Portsmouth")
str_extract_all(tweets, "#[a-z//A-Z//0-9]+")
[[1]]
 [1] "#dayatthenews" "#pompeyhacks" "#portsmouth" "#southsea" "#Portsmouth" "#portsmouth" "#portsmouth"
 [8] "#portsmouth" "#GunwharfQuays" "#Portsmouth" "#Gunwharf" "#portsmouth" "#Portsmouth" "#Gunwharf"
[15] "#portsmouth" "#GunwharfQuays" "#Portsmouth" "#Portsmouth" "#Portsmouth" "#Portsmouth" "#K1"
[22] "#gym01" "#portsmouth" "#Gunwharf" "#portsmouth" "#portsmouth" "#southsea" "#GunwharfQuays"
[29] "#Portsmouth" "#GunwharfQuays" "#Portsmouth" "#GunwharfQuays" "#Portsmouth" "#GunwharfQuays" "#Portsmouth"
Please I need help.
Mariam
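One way to handle many tweets in one go, sketched in base R under the assumption that each tweet is kept as its own element of a character vector instead of being pasted into a single string. The tweets and the tallying step are illustrative additions, not part of the original question.

```r
# Hypothetical tweets, one per element.
tweets <- c("user_a: #Portsmouth fire at #GunwharfQuays",
            "user_b: RT #portsmouth #southsea",
            "user_c: see http://t.co/abc #Portsmouth")

# Inside [...] each '/' is a literal slash, so "#[a-z//A-Z//0-9]+" also
# accepts slashes as part of a hashtag.
slash_match <- regmatches("#a/b", regexpr("#[a-z//A-Z//0-9]+", "#a/b"))

# "#[A-Za-z0-9]+" restricts matches to letters and digits.
tags <- unlist(regmatches(tweets, gregexpr("#[A-Za-z0-9]+", tweets)))

# Fold case so "#Portsmouth" and "#portsmouth" are counted together.
counts <- sort(table(tolower(tags)), decreasing = TRUE)
```

Keeping one tweet per element means no manual editing of the downloaded text is needed, and the case-folded table gives a quick count of how often each hashtag occurs.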
Sarah Goslee http://www.functionaldiversity.org