SQL query to count rows based on previous values of different column
I'm working in SAS and I have a table that looks like this
ID | Time | Main | lag_1 | lag_2
----------------------------------------------------------------------------
A | 01 | 0 | 0 | 1
A | 03 | 0 | 0 | 1
A | 04 | 0 | 0 | 0
A | 10 | 1 | 0 | 0
A | 11 | 1 | 0 | 0
A | 12 | 1 | 0 | 0
B | 02 | 1 | 1 | 1
B | 04 | 0 | 1 | 1
B | 07 | 0 | 0 | 1
B | 10 | 1 | 0 | 0
B | 11 | 1 | 0 | 0
B | 12 | 1 | 0 | 0
except with multiple IDs. The table is sorted by ID and Time. After calculating the total count of ones in the Main column (call it tot), I am trying to calculate 2 things:
- The total count of ones in the Main column only if lag_1 has been equal to 1 at some time before Main became 1, say tot_1; and
- The same as 1. but in this case for lag_2, call the variable tot_2
The table of expected calculations would give me that
tot | tot_1 | tot_2
--------------------
7 | 3 | 6
since tot_1 should be 3 (0 from ID = A + 3 from ID = B), and tot_2 should be 6 (3 from ID = A + 3 from ID = B).
I am a complete beginner in these types of segmentations so any help is greatly appreciated.
Edit: I would expect that tot_2 >= tot_1 because lag_2 is built on events from Main which go longer back in time than lag_1 does.
sql sas conditional
add a comment |
I'm working in SAS and I have a table that looks like this
ID | Time | Main | lag_1 | lag_2
----------------------------------------------------------------------------
A | 01 | 0 | 0 | 1
A | 03 | 0 | 0 | 1
A | 04 | 0 | 0 | 0
A | 10 | 1 | 0 | 0
A | 11 | 1 | 0 | 0
A | 12 | 1 | 0 | 0
B | 02 | 1 | 1 | 1
B | 04 | 0 | 1 | 1
B | 07 | 0 | 0 | 1
B | 10 | 1 | 0 | 0
B | 11 | 1 | 0 | 0
B | 12 | 1 | 0 | 0
except with multiple IDs. The table is sorted by ID and Time. After calculating the total count of ones in the Main column (call it tot), I am trying to calculate 2 things:
- The total count of ones in the Main column only if lag_1 has been equal to 1 at some time before Main became 1, say tot_1; and
- The same as 1. but in this case for lag_2, call the variable tot_2
The table of expected calculations would give me that
tot | tot_1 | tot_2
--------------------
7 | 3 | 6
since tot_1 should be 3 (0 from ID = A + 3 from ID = B), and tot_2 should be 6 (3 from ID = A + 3 from ID = B).
I am a complete beginner in these types of segmentations so any help is greatly appreciated.
Edit: I would expect that tot_2 >= tot_1 because lag_2 is built on events from Main which go longer back in time than lag_1 does.
sql sas conditional
1
Specify the expected result as well, when having that table data.
– jarlh
Nov 22 '18 at 15:01
Shouldn't TOT_2 by 6 and not 7? Can you really count the first observation for id=B?
– Tom
Nov 22 '18 at 16:28
Tom you are correct, I edited the post accordingly.
– Emil
Nov 22 '18 at 16:37
add a comment |
I'm working in SAS and I have a table that looks like this
ID | Time | Main | lag_1 | lag_2
----------------------------------------------------------------------------
A | 01 | 0 | 0 | 1
A | 03 | 0 | 0 | 1
A | 04 | 0 | 0 | 0
A | 10 | 1 | 0 | 0
A | 11 | 1 | 0 | 0
A | 12 | 1 | 0 | 0
B | 02 | 1 | 1 | 1
B | 04 | 0 | 1 | 1
B | 07 | 0 | 0 | 1
B | 10 | 1 | 0 | 0
B | 11 | 1 | 0 | 0
B | 12 | 1 | 0 | 0
except with multiple IDs. The table is sorted by ID and Time. After calculating the total count of ones in the Main column (call it tot), I am trying to calculate 2 things:
- The total count of ones in the Main column only if lag_1 has been equal to 1 at some time before Main became 1, say tot_1; and
- The same as 1. but in this case for lag_2, call the variable tot_2
The table of expected calculations would give me that
tot | tot_1 | tot_2
--------------------
7 | 3 | 6
since tot_1 should be 3 (0 from ID = A + 3 from ID = B), and tot_2 should be 6 (3 from ID = A + 3 from ID = B).
I am a complete beginner in these types of segmentations so any help is greatly appreciated.
Edit: I would expect that tot_2 >= tot_1 because lag_2 is built on events from Main which go longer back in time than lag_1 does.
sql sas conditional
I'm working in SAS and I have a table that looks like this
ID | Time | Main | lag_1 | lag_2
----------------------------------------------------------------------------
A | 01 | 0 | 0 | 1
A | 03 | 0 | 0 | 1
A | 04 | 0 | 0 | 0
A | 10 | 1 | 0 | 0
A | 11 | 1 | 0 | 0
A | 12 | 1 | 0 | 0
B | 02 | 1 | 1 | 1
B | 04 | 0 | 1 | 1
B | 07 | 0 | 0 | 1
B | 10 | 1 | 0 | 0
B | 11 | 1 | 0 | 0
B | 12 | 1 | 0 | 0
except with multiple IDs. The table is sorted by ID and Time. After calculating the total count of ones in the Main column (call it tot), I am trying to calculate 2 things:
- The total count of ones in the Main column only if lag_1 has been equal to 1 at some time before Main became 1, say tot_1; and
- The same as 1. but in this case for lag_2, call the variable tot_2
The table of expected calculations would give me that
tot | tot_1 | tot_2
--------------------
7 | 3 | 6
since tot_1 should be 3 (0 from ID = A + 3 from ID = B), and tot_2 should be 6 (3 from ID = A + 3 from ID = B).
I am a complete beginner in these types of segmentations so any help is greatly appreciated.
Edit: I would expect that tot_2 >= tot_1 because lag_2 is built on events from Main which go longer back in time than lag_1 does.
sql sas conditional
sql sas conditional
edited Nov 22 '18 at 16:36
Emil
asked Nov 22 '18 at 14:57
EmilEmil
10816
10816
1
Specify the expected result as well, when having that table data.
– jarlh
Nov 22 '18 at 15:01
Shouldn't TOT_2 by 6 and not 7? Can you really count the first observation for id=B?
– Tom
Nov 22 '18 at 16:28
Tom you are correct, I edited the post accordingly.
– Emil
Nov 22 '18 at 16:37
add a comment |
1
Specify the expected result as well, when having that table data.
– jarlh
Nov 22 '18 at 15:01
Shouldn't TOT_2 by 6 and not 7? Can you really count the first observation for id=B?
– Tom
Nov 22 '18 at 16:28
Tom you are correct, I edited the post accordingly.
– Emil
Nov 22 '18 at 16:37
1
1
Specify the expected result as well, when having that table data.
– jarlh
Nov 22 '18 at 15:01
Specify the expected result as well, when having that table data.
– jarlh
Nov 22 '18 at 15:01
Shouldn't TOT_2 by 6 and not 7? Can you really count the first observation for id=B?
– Tom
Nov 22 '18 at 16:28
Shouldn't TOT_2 by 6 and not 7? Can you really count the first observation for id=B?
– Tom
Nov 22 '18 at 16:28
Tom you are correct, I edited the post accordingly.
– Emil
Nov 22 '18 at 16:37
Tom you are correct, I edited the post accordingly.
– Emil
Nov 22 '18 at 16:37
add a comment |
3 Answers
3
active
oldest
votes
Much easier to do in a data step. That way you can check for start of new id and reset the flag for whether the lag_x variables were ever true.
data want ;
set have end=eof;
by id time ;
tot + main ;
if first.id then call missing(any_lag_1,any_lag_2);
if any_lag_1 then tot_1 + main ;
if any_lag_2 then tot_2 + main ;
if eof then output;
any_lag_1+lag_1;
any_lag_2+lag_2;
keep tot: ;
run;
Easier and more direct than a proc sql indeed, thank you!
– Emil
Nov 26 '18 at 9:22
add a comment |
If I understand correctly, you want these sums per id. The key is comparing the minimum value of the id under different circumstances, and then doing the sums. This is all conditional aggregation:
select sum(tot) as tot,
sum(case when id_lag_1 < id_main then tot else 0 end) as tot_1,
sum(case when id_lag_2 < id_main then tot else 0 end) as tot_2
from (select id, sum(main) as tot,
min(case when main = 1 then id end) as id_main,
min(case when lag_1 = 1 then id end) as id_lag_1,
min(case when lag_2 = 1 then id end) as id_lag_2
from t
group by id
) t;
Thank you! I applied your solution but it only works partially, i.e. works fine for tot and tot_1, but tot_2 comes out as 0, even though I expect the opposite, namely tot_2 >= tot_1.
– Emil
Nov 22 '18 at 16:42
@Emil . . . Thecaseexpressions hadelsetwice. That should bethen tot else 0.
– Gordon Linoff
Nov 23 '18 at 3:11
add a comment |
Consider the computation for tot_1 and tot_2
My first step is to look for a pattern where lag_1 > main (This fulfills the case you mentioned that,ie find records where lag_1=1 sometime before main=1) and i name all such values as 'grp_lag_1' and 'grp_lag_2'
Once i have grouped the records, i "copy" down the values using max() over(order by id,time1).
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
So i get a result as follows
+----+-------+------+-------+-------+-----------+-----------+
| id | time1 | main | lag_1 | lag_2 | grp_1 | grp_2 |
+----+-------+------+-------+-------+-----------+-----------+
| A | 01 | 0 | 0 | 1 | | grp_lag_2 |
| A | 03 | 0 | 0 | 1 | | grp_lag_2 |
| A | 04 | 0 | 0 | 0 | | grp_lag_2 |
| A | 10 | 1 | 0 | 0 | | grp_lag_2 |
| A | 11 | 1 | 0 | 0 | | grp_lag_2 |
| A | 12 | 1 | 0 | 0 | | grp_lag_2 |
| B | 02 | 1 | 1 | 1 | | |
| B | 04 | 0 | 1 | 1 | grp_lag_1 | grp_lag_2 |
| B | 07 | 0 | 0 | 1 | grp_lag_1 | grp_lag_2 |
| B | 10 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 11 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 12 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
+----+-------+------+-------+-------+-----------+-----------+
After this if i were to sumup the main values for grp_lag_1 i would get tot_1 and likewise summing up grp+lag_2 i would get tot_2
select sum(main) as tot_cnt
,sum(case when grp_1='grp_lag_1' then main end) as tot_1
,sum(case when grp_2='grp_lag_2' then main end) as tot_2
from(
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
)x
+---------+-------+-------+
| tot_cnt | tot_1 | tot_2 |
+---------+-------+-------+
| 7 | 3 | 6 |
+---------+-------+-------+
Demo
https://dbfiddle.uk/?rdbms=sqlserver_2012&fiddle=c17be111dbc3c516afa2bc3dcd3c9e1c
Unfortunately SAS does not support OVER or PARTITION in SQL.
– Reeza
Nov 22 '18 at 22:09
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53433602%2fsql-query-to-count-rows-based-on-previous-values-of-different-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
Much easier to do in a data step. That way you can check for start of new id and reset the flag for whether the lag_x variables were ever true.
data want ;
set have end=eof;
by id time ;
tot + main ;
if first.id then call missing(any_lag_1,any_lag_2);
if any_lag_1 then tot_1 + main ;
if any_lag_2 then tot_2 + main ;
if eof then output;
any_lag_1+lag_1;
any_lag_2+lag_2;
keep tot: ;
run;
Easier and more direct than a proc sql indeed, thank you!
– Emil
Nov 26 '18 at 9:22
add a comment |
Much easier to do in a data step. That way you can check for start of new id and reset the flag for whether the lag_x variables were ever true.
data want ;
set have end=eof;
by id time ;
tot + main ;
if first.id then call missing(any_lag_1,any_lag_2);
if any_lag_1 then tot_1 + main ;
if any_lag_2 then tot_2 + main ;
if eof then output;
any_lag_1+lag_1;
any_lag_2+lag_2;
keep tot: ;
run;
Easier and more direct than a proc sql indeed, thank you!
– Emil
Nov 26 '18 at 9:22
add a comment |
Much easier to do in a data step. That way you can check for start of new id and reset the flag for whether the lag_x variables were ever true.
data want ;
set have end=eof;
by id time ;
tot + main ;
if first.id then call missing(any_lag_1,any_lag_2);
if any_lag_1 then tot_1 + main ;
if any_lag_2 then tot_2 + main ;
if eof then output;
any_lag_1+lag_1;
any_lag_2+lag_2;
keep tot: ;
run;
Much easier to do in a data step. That way you can check for start of new id and reset the flag for whether the lag_x variables were ever true.
data want ;
set have end=eof;
by id time ;
tot + main ;
if first.id then call missing(any_lag_1,any_lag_2);
if any_lag_1 then tot_1 + main ;
if any_lag_2 then tot_2 + main ;
if eof then output;
any_lag_1+lag_1;
any_lag_2+lag_2;
keep tot: ;
run;
answered Nov 22 '18 at 17:32
TomTom
22.8k2718
22.8k2718
Easier and more direct than a proc sql indeed, thank you!
– Emil
Nov 26 '18 at 9:22
add a comment |
Easier and more direct than a proc sql indeed, thank you!
– Emil
Nov 26 '18 at 9:22
Easier and more direct than a proc sql indeed, thank you!
– Emil
Nov 26 '18 at 9:22
Easier and more direct than a proc sql indeed, thank you!
– Emil
Nov 26 '18 at 9:22
add a comment |
If I understand correctly, you want these sums per id. The key is comparing the minimum value of the id under different circumstances, and then doing the sums. This is all conditional aggregation:
select sum(tot) as tot,
sum(case when id_lag_1 < id_main then tot else 0 end) as tot_1,
sum(case when id_lag_2 < id_main then tot else 0 end) as tot_2
from (select id, sum(main) as tot,
min(case when main = 1 then id end) as id_main,
min(case when lag_1 = 1 then id end) as id_lag_1,
min(case when lag_2 = 1 then id end) as id_lag_2
from t
group by id
) t;
Thank you! I applied your solution but it only works partially, i.e. works fine for tot and tot_1, but tot_2 comes out as 0, even though I expect the opposite, namely tot_2 >= tot_1.
– Emil
Nov 22 '18 at 16:42
@Emil . . . Thecaseexpressions hadelsetwice. That should bethen tot else 0.
– Gordon Linoff
Nov 23 '18 at 3:11
add a comment |
If I understand correctly, you want these sums per id. The key is comparing the minimum value of the id under different circumstances, and then doing the sums. This is all conditional aggregation:
select sum(tot) as tot,
sum(case when id_lag_1 < id_main then tot else 0 end) as tot_1,
sum(case when id_lag_2 < id_main then tot else 0 end) as tot_2
from (select id, sum(main) as tot,
min(case when main = 1 then id end) as id_main,
min(case when lag_1 = 1 then id end) as id_lag_1,
min(case when lag_2 = 1 then id end) as id_lag_2
from t
group by id
) t;
Thank you! I applied your solution but it only works partially, i.e. works fine for tot and tot_1, but tot_2 comes out as 0, even though I expect the opposite, namely tot_2 >= tot_1.
– Emil
Nov 22 '18 at 16:42
@Emil . . . Thecaseexpressions hadelsetwice. That should bethen tot else 0.
– Gordon Linoff
Nov 23 '18 at 3:11
add a comment |
If I understand correctly, you want these sums per id. The key is comparing the minimum value of the id under different circumstances, and then doing the sums. This is all conditional aggregation:
select sum(tot) as tot,
sum(case when id_lag_1 < id_main then tot else 0 end) as tot_1,
sum(case when id_lag_2 < id_main then tot else 0 end) as tot_2
from (select id, sum(main) as tot,
min(case when main = 1 then id end) as id_main,
min(case when lag_1 = 1 then id end) as id_lag_1,
min(case when lag_2 = 1 then id end) as id_lag_2
from t
group by id
) t;
If I understand correctly, you want these sums per id. The key is comparing the minimum value of the id under different circumstances, and then doing the sums. This is all conditional aggregation:
select sum(tot) as tot,
sum(case when id_lag_1 < id_main then tot else 0 end) as tot_1,
sum(case when id_lag_2 < id_main then tot else 0 end) as tot_2
from (select id, sum(main) as tot,
min(case when main = 1 then id end) as id_main,
min(case when lag_1 = 1 then id end) as id_lag_1,
min(case when lag_2 = 1 then id end) as id_lag_2
from t
group by id
) t;
edited Nov 23 '18 at 3:10
answered Nov 22 '18 at 15:23
Gordon LinoffGordon Linoff
765k35297401
765k35297401
Thank you! I applied your solution but it only works partially, i.e. works fine for tot and tot_1, but tot_2 comes out as 0, even though I expect the opposite, namely tot_2 >= tot_1.
– Emil
Nov 22 '18 at 16:42
@Emil . . . Thecaseexpressions hadelsetwice. That should bethen tot else 0.
– Gordon Linoff
Nov 23 '18 at 3:11
add a comment |
Thank you! I applied your solution but it only works partially, i.e. works fine for tot and tot_1, but tot_2 comes out as 0, even though I expect the opposite, namely tot_2 >= tot_1.
– Emil
Nov 22 '18 at 16:42
@Emil . . . Thecaseexpressions hadelsetwice. That should bethen tot else 0.
– Gordon Linoff
Nov 23 '18 at 3:11
Thank you! I applied your solution but it only works partially, i.e. works fine for tot and tot_1, but tot_2 comes out as 0, even though I expect the opposite, namely tot_2 >= tot_1.
– Emil
Nov 22 '18 at 16:42
Thank you! I applied your solution but it only works partially, i.e. works fine for tot and tot_1, but tot_2 comes out as 0, even though I expect the opposite, namely tot_2 >= tot_1.
– Emil
Nov 22 '18 at 16:42
@Emil . . . The
case expressions had else twice. That should be then tot else 0.– Gordon Linoff
Nov 23 '18 at 3:11
@Emil . . . The
case expressions had else twice. That should be then tot else 0.– Gordon Linoff
Nov 23 '18 at 3:11
add a comment |
Consider the computation for tot_1 and tot_2
My first step is to look for a pattern where lag_1 > main (This fulfills the case you mentioned that,ie find records where lag_1=1 sometime before main=1) and i name all such values as 'grp_lag_1' and 'grp_lag_2'
Once i have grouped the records, i "copy" down the values using max() over(order by id,time1).
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
So i get a result as follows
+----+-------+------+-------+-------+-----------+-----------+
| id | time1 | main | lag_1 | lag_2 | grp_1 | grp_2 |
+----+-------+------+-------+-------+-----------+-----------+
| A | 01 | 0 | 0 | 1 | | grp_lag_2 |
| A | 03 | 0 | 0 | 1 | | grp_lag_2 |
| A | 04 | 0 | 0 | 0 | | grp_lag_2 |
| A | 10 | 1 | 0 | 0 | | grp_lag_2 |
| A | 11 | 1 | 0 | 0 | | grp_lag_2 |
| A | 12 | 1 | 0 | 0 | | grp_lag_2 |
| B | 02 | 1 | 1 | 1 | | |
| B | 04 | 0 | 1 | 1 | grp_lag_1 | grp_lag_2 |
| B | 07 | 0 | 0 | 1 | grp_lag_1 | grp_lag_2 |
| B | 10 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 11 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 12 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
+----+-------+------+-------+-------+-----------+-----------+
After this if i were to sumup the main values for grp_lag_1 i would get tot_1 and likewise summing up grp+lag_2 i would get tot_2
select sum(main) as tot_cnt
,sum(case when grp_1='grp_lag_1' then main end) as tot_1
,sum(case when grp_2='grp_lag_2' then main end) as tot_2
from(
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
)x
+---------+-------+-------+
| tot_cnt | tot_1 | tot_2 |
+---------+-------+-------+
| 7 | 3 | 6 |
+---------+-------+-------+
Demo
https://dbfiddle.uk/?rdbms=sqlserver_2012&fiddle=c17be111dbc3c516afa2bc3dcd3c9e1c
Unfortunately SAS does not support OVER or PARTITION in SQL.
– Reeza
Nov 22 '18 at 22:09
add a comment |
Consider the computation for tot_1 and tot_2
My first step is to look for a pattern where lag_1 > main (This fulfills the case you mentioned that,ie find records where lag_1=1 sometime before main=1) and i name all such values as 'grp_lag_1' and 'grp_lag_2'
Once i have grouped the records, i "copy" down the values using max() over(order by id,time1).
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
So i get a result as follows
+----+-------+------+-------+-------+-----------+-----------+
| id | time1 | main | lag_1 | lag_2 | grp_1 | grp_2 |
+----+-------+------+-------+-------+-----------+-----------+
| A | 01 | 0 | 0 | 1 | | grp_lag_2 |
| A | 03 | 0 | 0 | 1 | | grp_lag_2 |
| A | 04 | 0 | 0 | 0 | | grp_lag_2 |
| A | 10 | 1 | 0 | 0 | | grp_lag_2 |
| A | 11 | 1 | 0 | 0 | | grp_lag_2 |
| A | 12 | 1 | 0 | 0 | | grp_lag_2 |
| B | 02 | 1 | 1 | 1 | | |
| B | 04 | 0 | 1 | 1 | grp_lag_1 | grp_lag_2 |
| B | 07 | 0 | 0 | 1 | grp_lag_1 | grp_lag_2 |
| B | 10 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 11 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 12 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
+----+-------+------+-------+-------+-----------+-----------+
After this if i were to sumup the main values for grp_lag_1 i would get tot_1 and likewise summing up grp+lag_2 i would get tot_2
select sum(main) as tot_cnt
,sum(case when grp_1='grp_lag_1' then main end) as tot_1
,sum(case when grp_2='grp_lag_2' then main end) as tot_2
from(
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
)x
+---------+-------+-------+
| tot_cnt | tot_1 | tot_2 |
+---------+-------+-------+
| 7 | 3 | 6 |
+---------+-------+-------+
Demo
https://dbfiddle.uk/?rdbms=sqlserver_2012&fiddle=c17be111dbc3c516afa2bc3dcd3c9e1c
Unfortunately SAS does not support OVER or PARTITION in SQL.
– Reeza
Nov 22 '18 at 22:09
add a comment |
Consider the computation for tot_1 and tot_2
My first step is to look for a pattern where lag_1 > main (This fulfills the case you mentioned that,ie find records where lag_1=1 sometime before main=1) and i name all such values as 'grp_lag_1' and 'grp_lag_2'
Once i have grouped the records, i "copy" down the values using max() over(order by id,time1).
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
So i get a result as follows
+----+-------+------+-------+-------+-----------+-----------+
| id | time1 | main | lag_1 | lag_2 | grp_1 | grp_2 |
+----+-------+------+-------+-------+-----------+-----------+
| A | 01 | 0 | 0 | 1 | | grp_lag_2 |
| A | 03 | 0 | 0 | 1 | | grp_lag_2 |
| A | 04 | 0 | 0 | 0 | | grp_lag_2 |
| A | 10 | 1 | 0 | 0 | | grp_lag_2 |
| A | 11 | 1 | 0 | 0 | | grp_lag_2 |
| A | 12 | 1 | 0 | 0 | | grp_lag_2 |
| B | 02 | 1 | 1 | 1 | | |
| B | 04 | 0 | 1 | 1 | grp_lag_1 | grp_lag_2 |
| B | 07 | 0 | 0 | 1 | grp_lag_1 | grp_lag_2 |
| B | 10 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 11 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 12 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
+----+-------+------+-------+-------+-----------+-----------+
After this if i were to sumup the main values for grp_lag_1 i would get tot_1 and likewise summing up grp+lag_2 i would get tot_2
select sum(main) as tot_cnt
,sum(case when grp_1='grp_lag_1' then main end) as tot_1
,sum(case when grp_2='grp_lag_2' then main end) as tot_2
from(
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
)x
+---------+-------+-------+
| tot_cnt | tot_1 | tot_2 |
+---------+-------+-------+
| 7 | 3 | 6 |
+---------+-------+-------+
Demo
https://dbfiddle.uk/?rdbms=sqlserver_2012&fiddle=c17be111dbc3c516afa2bc3dcd3c9e1c
Consider the computation for tot_1 and tot_2
My first step is to look for a pattern where lag_1 > main (This fulfills the case you mentioned that,ie find records where lag_1=1 sometime before main=1) and i name all such values as 'grp_lag_1' and 'grp_lag_2'
Once i have grouped the records, i "copy" down the values using max() over(order by id,time1).
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
So i get a result as follows
+----+-------+------+-------+-------+-----------+-----------+
| id | time1 | main | lag_1 | lag_2 | grp_1 | grp_2 |
+----+-------+------+-------+-------+-----------+-----------+
| A | 01 | 0 | 0 | 1 | | grp_lag_2 |
| A | 03 | 0 | 0 | 1 | | grp_lag_2 |
| A | 04 | 0 | 0 | 0 | | grp_lag_2 |
| A | 10 | 1 | 0 | 0 | | grp_lag_2 |
| A | 11 | 1 | 0 | 0 | | grp_lag_2 |
| A | 12 | 1 | 0 | 0 | | grp_lag_2 |
| B | 02 | 1 | 1 | 1 | | |
| B | 04 | 0 | 1 | 1 | grp_lag_1 | grp_lag_2 |
| B | 07 | 0 | 0 | 1 | grp_lag_1 | grp_lag_2 |
| B | 10 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 11 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
| B | 12 | 1 | 0 | 0 | grp_lag_1 | grp_lag_2 |
+----+-------+------+-------+-------+-----------+-----------+
After this if i were to sumup the main values for grp_lag_1 i would get tot_1 and likewise summing up grp+lag_2 i would get tot_2
select sum(main) as tot_cnt
,sum(case when grp_1='grp_lag_1' then main end) as tot_1
,sum(case when grp_2='grp_lag_2' then main end) as tot_2
from(
select *
,max(case when lag_1 > main then 'grp_lag_1' end) over(partition by id order by id,time1) as grp_1
,max(case when lag_2 > main then 'grp_lag_2' end) over(partition by id order by id,time1) as grp_2
from t
)x
+---------+-------+-------+
| tot_cnt | tot_1 | tot_2 |
+---------+-------+-------+
| 7 | 3 | 6 |
+---------+-------+-------+
Demo
https://dbfiddle.uk/?rdbms=sqlserver_2012&fiddle=c17be111dbc3c516afa2bc3dcd3c9e1c
answered Nov 22 '18 at 17:24
George JosephGeorge Joseph
1,44249
1,44249
Unfortunately SAS does not support OVER or PARTITION in SQL.
– Reeza
Nov 22 '18 at 22:09
add a comment |
Unfortunately SAS does not support OVER or PARTITION in SQL.
– Reeza
Nov 22 '18 at 22:09
Unfortunately SAS does not support OVER or PARTITION in SQL.
– Reeza
Nov 22 '18 at 22:09
Unfortunately SAS does not support OVER or PARTITION in SQL.
– Reeza
Nov 22 '18 at 22:09
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53433602%2fsql-query-to-count-rows-based-on-previous-values-of-different-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Specify the expected result as well, when having that table data.
– jarlh
Nov 22 '18 at 15:01
Shouldn't TOT_2 by 6 and not 7? Can you really count the first observation for id=B?
– Tom
Nov 22 '18 at 16:28
Tom you are correct, I edited the post accordingly.
– Emil
Nov 22 '18 at 16:37