Wednesday, March 21, 2018

r - Cumulative sum based on certain conditions

Here is a dplyr-based answer using the same logic as @Floo0. This will tend to get slow as you have a larger number of groups.




First, I added the row numbers as a column to the original dataset. The calculation of CumSumB will be done for each unique row using this approach.



library(dplyr)

dat = dat %>% mutate(row = row_number())


Then I join the dataset to itself, joining X to Y and by Date. To avoid many duplicate columns with added suffixes, I selected only some of the columns for the x dataset of the join (i.e., first dataset of left_join).




I kept the variable row in both datasets on purpose, so I end up with a variable called row.x that indicates the original row number of each X value and a variable called row.y indicating the original row number of each Y value.



dat %>% 
left_join(select(dat, X, Date, Y, row), ., by = c("X" = "Y", "Date" = "Date"))


Once that is done, the dataset just needs to be grouped by row.x and the sum of Qty calculated conditional on row.x being less than or equal to row.y.



dat %>% 
left_join(select(dat, X, Date, Y, row), ., by = c("X" = "Y", "Date" = "Date")) %>%

group_by(row.x) %>%
summarise(CumSumB = sum(Qty[row.y <= row.x]))


Last, this can be joined back to the original dataset. The result still contains a column representing the row number, which could be removed via select(-row) if needed.



dat %>% 
left_join(select(dat, X, Date, Y, row), ., by = c("X" = "Y", "Date" = "Date")) %>%
group_by(row.x) %>%
summarise(CumSumB = sum(Qty[row.y <= row.x])) %>%

left_join(dat, ., by = c("row" = "row.x"))

X Y Date Qty CumSumA CumSumB.x row CumSumB.y
1 A B 1/1 1 1 0 1 0
2 A A 1/1 2 3 2 2 2
3 A E 1/1 2 5 2 3 2
4 B A 1/1 1 1 1 4 1
5 B B 1/1 3 4 4 5 4
6 B C 1/1 2 6 4 6 4
7 C D 1/1 2 2 2 7 2

8 C E 1/1 4 6 2 8 2
9 C A 1/1 1 7 2 9 2
10 A C 1/2 2 2 0 10 0
11 A D 1/2 3 5 0 11 0
12 A E 1/2 2 7 0 12 0
13 B A 1/2 5 5 0 13 0
14 B B 1/2 1 6 1 14 1
15 B C 1/2 2 8 1 15 1
16 C D 1/2 2 2 4 16 4
17 C E 1/2 1 1 4 17 4

18 C A 1/2 3 4 4 18 4

No comments:

Post a Comment

plot explanation - Why did Peaches&#39; mom hang on the tree? - Movies &amp; TV

In the middle of the movie Ice Age: Continental Drift Peaches' mom asked Peaches to go to sleep. Then, she hung on the tree. This parti...