Dataframe Manipulation

Hi Hemant,

data_help <- data_help %>%
# Add a dummy index for each purchase to keep track of the purchase,
# since it will disappear later on. You could also use row_number()
mutate(Purchase_ID = 1:n()) %>%
# For each purchase id
group_by(Purchase_ID) %>%
# Call the split_items function, which returns a data.frame
do(split_items(.))

cat_help %>%
# Make the data.frame long, where the column names are gathered in a dummy
# column (Foo) and the items (the content of each column) in another column
# called Item
gather("Foo", "Item") %>%
filter(!is.na(Item)) %>%
left_join(data_help, by = "Item") %>%
group_by(Foo, Purchase_ID) %>%
# Combine the items for each purchase and item type and make a wide
# data.frame
summarise(Item = paste(Item, collapse = ", ")) %>%
spread(key = "Foo", value = "Item")
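
If you are on a recent tidyr, gather() and spread() are superseded by pivot_longer() and pivot_wider(). Below is a minimal, self-contained sketch of the same steps on made-up data. The toy data_help and cat_help, the Items column and the name Category are assumptions for illustration only, and separate_rows() merely stands in for split_items(), on the assumption that each purchase stores its items as one comma-separated string.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data, not the data from this thread.
# One row per purchase, items stored as a comma-separated string.
data_help <- tibble(Items = c("apple, milk", "bread, milk, cheese"))

# One column per item category, padded with NA.
cat_help <- tibble(
  Fruit  = c("apple", NA),
  Dairy  = c("milk", "cheese"),
  Bakery = c("bread", NA)
)

# Stand-in for split_items(): one row per purchased item,
# keeping the purchase index.
items_long <- data_help %>%
  mutate(Purchase_ID = row_number()) %>%
  separate_rows(Items, sep = ",\\s*") %>%
  rename(Item = Items)

result <- cat_help %>%
  # Long format: category names in Category, items in Item
  pivot_longer(everything(), names_to = "Category", values_to = "Item") %>%
  filter(!is.na(Item)) %>%
  left_join(items_long, by = "Item") %>%
  group_by(Category, Purchase_ID) %>%
  # Combine the items for each purchase and category
  summarise(Item = paste(Item, collapse = ", "), .groups = "drop") %>%
  # Wide format: one row per purchase, one column per category
  pivot_wider(names_from = Category, values_from = Item)

print(result)
```

The result has one row per Purchase_ID and one column per category, with NA where a purchase contains no item of that category.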

I suggest that you read the book [R for Data Science](http://r4ds.had.co.nz/)
by Garrett Grolemund and Hadley Wickham.

Best wishes,
Ulrik
