
How to form groups for this specific problem?

5 messages · Sarah Goslee, Satish Vadlamani, Adams, Jean

#
Hello All:
I would like some help with the following problem, and to understand how
it can be done efficiently in R. The data frame is shown below, with its
header row.

*Component, TLA*
C1, TLA1
C2, TLA1
C1, TLA2
C3, TLA2
C4, TLA3
C5, TLA3

Notice that C1 is a component of TLA1 and TLA2.

I would like to partition the rows into mutually exclusive groups and add a
new column called Group that identifies each row's group. For the above
data, the groups and the new Group column values would look like this:

*Component, TLA, Group*
C1, TLA1, 1
C2, TLA1, 1
C1, TLA2, 1
C3, TLA2, 1
C4, TLA3, 2
C5, TLA3, 2

I would appreciate any help with this. I could loop through the observations
and apply some logic, but I have not tried that yet.
#
It isn't at all clear to me how you are creating the groups. They
aren't the unique combinations of Component and TLA. They might be
based only on TLA value: in your example TLA1 and TLA2 form one group,
and TLA3 the other.

Without understanding your logic, I can't replicate it with R code.

Sarah

On Sun, Mar 27, 2016 at 8:56 PM, Satish Vadlamani
<satish.vadlamani at gmail.com> wrote:
And please don't post in HTML.
#
Satish,

If you rearrange your data into a network of nodes and edges, you can use
the igraph package to identify disconnected (mutually exclusive) groups.

# example data
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3")
)

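# characterize data as a network of nodes and edges
# (as.character() makes this work for character or factor columns)
nodes <- unique(as.character(unlist(df)))
edges <- apply(df, 2, match, nodes)

# use the igraph package to identify disconnected groups
library(igraph)
# each row of the edges matrix is one Component-TLA link
g <- graph_from_edgelist(edges, directed = FALSE)
ngroup <- components(g)$membership
df$Group <- ngroup[match(df$Component, nodes)]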
df

  Component  TLA Group
1        C1 TLA1     1
2        C2 TLA1     1
3        C1 TLA2     1
4        C3 TLA2     1
5        C4 TLA3     2
6        C5 TLA3     2

Jean

On Sun, Mar 27, 2016 at 7:56 PM, Satish Vadlamani <
satish.vadlamani at gmail.com> wrote:

#
Jean:
Wow. Thank you so much for this. I will read up on igraph and then see
whether this will work for my larger dataset.

Thanks for the wonderful code snippet you wrote. Basically, the requirement
is this:
TLA1 (Top Level Assembly) and its components should belong to the same
group. If one of those components also belongs to a different TLA (say
TLA2), then TLA2 and all of its components should belong to the same group
as TLA1.
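
Stated this way, the requirement is to find connected groups: two rows belong together if they share a Component or a TLA, either directly or through a chain of other rows. As a rough sketch only, assuming no missing values in either column, the same grouping can be done in base R without igraph. The helper name group_rows is hypothetical and is not part of any package or of the code posted earlier in this thread.

```r
# Hypothetical helper: label connected groups of Component/TLA rows in base R.
# Rows that share a Component or a TLA, directly or through a chain of
# other rows, end up with the same group number.
group_rows <- function(component, tla) {
  component <- as.character(component)   # avoid unused factor levels
  tla <- as.character(tla)
  label <- seq_along(component)          # start with one label per row
  repeat {
    previous <- label
    # each row takes the smallest label among rows with the same Component
    label <- ave(label, component, FUN = min)
    # then the smallest label among rows with the same TLA
    label <- ave(label, tla, FUN = min)
    if (identical(label, previous)) break  # stop once nothing changes
  }
  match(label, unique(label))            # renumber labels as 1, 2, 3, ...
}

df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3")
)
df$Group <- group_rows(df$Component, df$TLA)
df
```

Each pass spreads the smallest row label across rows that share a value, so the loop stops when every connected set of rows carries a single label. For a large dataset a graph library is likely to be faster than this repeated pass.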

Are these types of questions appropriate for this group?

Thanks,
Satish
On Mar 28, 2016 9:10 AM, "Adams, Jean" <jvadams at usgs.gov> wrote:

#
You're welcome, Satish.

Yes, questions that are seeking solutions in R code are appropriate for
this group.  It's helpful if you provide sample data (for example, using
dput()) and sample R code that folks can use.  And it's helpful if you show
the results that you are hoping to achieve (as you did).
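
As a small illustration of the dput() suggestion, here is a sketch using the example data from this thread. dput() prints an R expression that recreates the object, so the printed text can be pasted into an email and run by anyone. The names txt and df2 are only for this illustration.

```r
# Example data from this thread
df <- data.frame(
  Component = c("C1", "C2", "C1", "C3", "C4", "C5"),
  TLA = c("TLA1", "TLA1", "TLA2", "TLA2", "TLA3", "TLA3")
)

# Print a text representation that can be pasted into a message
dput(df)

# Check the round trip: evaluating the printed text rebuilds the data frame
txt <- capture.output(dput(df))
df2 <- eval(parse(text = txt))
all.equal(df, df2)
```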

Jean

On Mon, Mar 28, 2016 at 1:15 PM, Satish Vadlamani <
satish.vadlamani at gmail.com> wrote: