Message-ID: <34d9b5e4-6d84-42b0-bb98-d652317bc00f@syonic.eu>
Date: 2023-10-16T11:41:48Z
From: Leonard Mada
Subject: Create new data frame with conditional sums
In-Reply-To: <127afb3d-e376-4acf-9057-fc84b37e792a@syonic.eu>

Dear Jason,

The code could look something like:

dummyData = data.frame(Tract=seq(1, 10, by=1),
    Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
    Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))

# Define the cutoffs
# - a step of 0.03 puts several tracts into the same cutoff (duplicate entries);
by = 0.03; # by = 0.01;
cutoffs <- seq(0, 0.20, by = by)

# Create a new column with cutoffs
# - note: Pct values above the last break (0.21 in the dummy data)
#   become NA and are sorted last;
dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
    labels = cutoffs[-1], ordered_result = TRUE)

# Sort data
# - we could actually order only the columns:
#   Totpop & Cutoff;
dummyData = dummyData[order(dummyData$Cutoff), ]

# Result: cumulative sum over the sorted data
cs = cumsum(dummyData$Totpop)

# Only last entry:
# - I do not have a nice one-liner, but this should do it:
isLast = rev(! duplicated(rev(dummyData$Cutoff)))

data.frame(Total = cs[isLast],
    Cutoff = dummyData$Cutoff[isLast])
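
A possible alternative to the isLast step, as a minimal sketch rather than a tested replacement: sum Totpop within each cutoff level using tapply, then apply cumsum to the per-level sums. The names perLevel and cumTotal are only illustrative, and the default = 0 argument of tapply assumes R >= 3.4.0. There are two differences from the code above: empty levels (0.15 and 0.18 here) are listed and carry the previous total forward, while rows whose Cutoff is NA are dropped instead of being added to a final NA row.

```r
# Same dummy data and cutoffs as in the code above
dummyData <- data.frame(Tract = seq(1, 10, by = 1),
    Pct = c(0.05, 0.03, 0.01, 0.12, 0.21, 0.04, 0.07, 0.09, 0.06, 0.03),
    Totpop = c(4000, 3500, 4500, 4100, 3900, 4250, 5100, 4700, 4950, 4800))
cutoffs <- seq(0, 0.20, by = 0.03)
dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
    labels = cutoffs[-1], ordered_result = TRUE)

# Sum Totpop within each cutoff level;
# - empty levels count as 0 (argument 'default' needs R >= 3.4.0);
# - rows with Cutoff = NA are ignored by tapply;
perLevel <- tapply(dummyData$Totpop, dummyData$Cutoff, sum, default = 0)

# Cumulative totals, one per level, in the order of the levels
cumTotal <- data.frame(Cutoff = names(perLevel),
    Total = as.vector(cumsum(perLevel)))
print(cumTotal)
```

With the dummy data this should give the totals 12800, 26000, 35800 and 39900 for the cutoffs 0.03 to 0.12, followed by 39900 for the two empty levels. No sorting of the data frame is needed, because tapply returns the sums in level order.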


Sincerely,

Leonard


On 10/15/2023 7:41 PM, Leonard Mada wrote:
> Dear Jason,
>
>
> I do not think that the solution based on aggregate offered by GPT was 
> correct. That quasi-solution only aggregates for every individual level.
>
>
> As I understand, you want the cumulative sum. The idea was proposed by
> Bert; you need only to sort first based on the cutoff (e.g. using an
> ordered factor), and then extract only the last value for each level.
> If Pct is unique, then you can skip this last step and use the cumsum
> directly (but on the sorted data set).
>
>
> Alternatives: see the solutions with loops or with sapply.
>
>
> Sincerely,
>
>
> Leonard
>
>