How to get distinct count for last x weeks data but group by week in redshift?

You have years, months and days in seperate columns if I understand it correctly. I think the easiest way is to “build” a proper date column and then working with this column.

The following query should give you the last 6 weeks including the current week.

select 
EXTRACT(week from TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD')) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
and TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD') >= DATEADD(day,-42,DATE_TRUNC('week', sysdate)) 
GROUP BY 1
ORDER BY 1 desc

However, there might be a challenge as weeks start on a Monday in redshift, so might need to do a slight manipulation (adding one day):

select 
EXTRACT(week from DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD'))) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
and DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD')) BETWEEN DATEADD(day,-42,DATE_TRUNC('week', sysdate)) AND DATEADD(day,-1,DATE_TRUNC('week', sysdate)) 
GROUP BY 1
ORDER BY 1 desc

Debugging:

I would start running this query first, to check if the date is calculated properly

select COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD') between '2020-10-18' and '2020-10-24'

Afterwards I would see if the week is calculated correctly:

select 
EXTRACT(week from DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD'))) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD') between '2020-10-18' and '2020-10-24'
group by 1
order by 1

And last but not least I would extend the timeframe and make it dynamic:

select 
EXTRACT(week from DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD'))) week_num
,COUNT(DISTINCT(CLIENTID))
FROM process.data
where type = 'pots'
and stype= 'kites'
and tires IN ('abc', 'def', 'ghi', 'jkl')
and comp IN ('data', 'hello', 'world')
AND year = '2020'
-- this is for month october but week 43
and DATEADD(day,1,TO_DATE(year||'-'||month||'-'|| dates_for_week,'YYYY-MM-DD')) Between DATEADD(day,-42,DATE_TRUNC('week', sysdate)) and DATEADD(day,-1,DATE_TRUNC('week', sysdate)) 
group by 1
order by 1

CLICK HERE to find out more related problems solutions.

Leave a Comment

Your email address will not be published.

Scroll to Top