Variable substitution in grep pattern

Thu, Jan 29, 2004 11:07 AM

Hi everybody.
I'm working with a data frame containing many character vectors in which
each observation is made up of one or more unique values.
Example:

[1] BSD License, GNU Library or Lesser General Public License (LGPL)
[2] Qt Public License (QPL)
[3] GNU General Public License (GPL)
66 Levels:  ... Zope Public License

As you can see, each observation can have one or more licenses associated
with it.
I want to build a vector with the number of times every element (e.g.
"BSD License") occurs in the vector, by itself or in association with
others (i.e. I want to count the elements containing "BSD License" as
well as those containing "BSD License, GNU Library or Lesser General
Public License (LGPL)", and so on).

I've tried to use a "for" loop as follows:

+ Licenza.elenco.prova[Licenza.elenco==i] <-
  length(grep(".*i.*",as.character(Licenza)))}

Here Licenza.elenco is a character vector containing all the unique
values I need to match (e.g. BSD License, Qt Public License (QPL), GNU
General Public License (GPL)).
However, R substitutes the variable as I expect only in the first place
(the index); in the pattern, grep matches all strings containing the
letter "i", that is 100% of the vector, except NAs of course.
After running the above code I get:

[1] 2235 2235 2235

I've tried escaping the variable name, enclosing it in brackets, but
nothing works as I want.
I'm sure I'm doing something wrong, but what?

Thanks in advance

Alberto Fornasier

Thomas Lumley

Thu, Jan 29, 2004 6:55 PM

On Thu, 29 Jan 2004, Alberto Fornasier wrote:

You can't do that.  If you could, how would you search for all strings
containing the letter "i"?

You need to use something like paste() to construct the pattern:

length(grep(paste(".*",i,".*",sep=""),as.character(Licenza)))}

	-thomas
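
For reference, here is a minimal self-contained sketch of the whole loop.
The sample data are invented for illustration and are not the poster's
real data. One further point, which is an addition and not part of the
reply above: the license names contain parentheses, which are regex
metacharacters, so a pattern built with paste() alone would not match
"Qt Public License (QPL)" literally. The sketch assumes a literal
substring match is what is wanted and therefore passes fixed = TRUE to
grep; with a literal match the leading and trailing ".*" are no longer
needed.

```r
# Toy data standing in for the poster's factor (values invented for illustration)
Licenza <- factor(c(
  "BSD License, GNU Library or Lesser General Public License (LGPL)",
  "Qt Public License (QPL)",
  "GNU General Public License (GPL)",
  "BSD License",
  NA
))

# The unique license names to count
Licenza.elenco <- c("BSD License",
                    "Qt Public License (QPL)",
                    "GNU General Public License (GPL)")

# Named vector of counts, one entry per license name
Licenza.elenco.prova <- integer(length(Licenza.elenco))
names(Licenza.elenco.prova) <- Licenza.elenco

for (i in Licenza.elenco) {
  # i is an R variable here, so its value (not the letter "i") is searched for.
  # fixed = TRUE treats that value as a literal substring, not as a regex.
  Licenza.elenco.prova[i] <-
    length(grep(i, as.character(Licenza), fixed = TRUE))
}

print(Licenza.elenco.prova)

# For comparison: the paste()-built regex treats "(QPL)" as a group, so it
# looks for "Qt Public License QPL" and finds nothing in the toy data.
regex.count <- length(grep(paste(".*", "Qt Public License (QPL)", ".*", sep = ""),
                           as.character(Licenza)))
print(regex.count)
```

The NA entry is never counted, because grep does not report NA elements
as matches. If regular-expression matching is actually wanted, the
metacharacters in each name would have to be escaped before calling
paste().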