I have a table with the following structure:
transaction_id  user_id  date_column
1               1        01-08-2011
2               2        01-08-2011
3               1        02-08-2011
4               1        03-08-2011
There can be at most one entry per user per date.
How can I get all the dates in a specific date range on which a given user_id has no entry?
So for the above table, with user_id = 2 and the date range 01-08-2011 to 03-08-2011, I want the result:
02-08-2011
03-08-2011
Right now I am using a for loop over all the dates in the given date range.
This works fine for a small date range, but I think it will become resource-heavy for a large one.
As suggested in a comment, create a table with the dates of interest (I'll call it datesofinterest). Every date from your date range needs to be put into this table.
datesofinterest table
--------------
date
--------------
01-08-2011
02-08-2011
03-08-2011
Then the datesofinterest table needs to be joined with all the userids -- this is the set of all possible combinations of dates-of-interest and userids.
Now you have to remove all those dates-of-interest/userids that are currently in your original table to get your final answer.
In relational algebra, it'd be something like:
(datesofinterest[date] x transaction[user_id]) - (transaction[date_column, user_id])
This page may help with translating '-' to SQL. Generating the dates to populate the datesofinterest table can be done in SQL, manually, or with a helper program (for example, Perl's DateTime).
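The set difference above can be sketched with SQLite's EXCEPT operator, driven from Python's sqlite3 module. This is only an illustration under some assumptions: the table is named transactions here (TRANSACTION is a keyword in SQLite), the dates are stored as ISO strings (YYYY-MM-DD) rather than DD-MM-YYYY, and datesofinterest is filled by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sample data from the question; dates rewritten as ISO strings (assumption).
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,
        user_id        INTEGER,
        date_column    TEXT
    );
    INSERT INTO transactions VALUES
        (1, 1, '2011-08-01'),
        (2, 2, '2011-08-01'),
        (3, 1, '2011-08-02'),
        (4, 1, '2011-08-03');

    -- Every date in the range of interest.
    CREATE TABLE datesofinterest (date TEXT PRIMARY KEY);
    INSERT INTO datesofinterest VALUES
        ('2011-08-01'), ('2011-08-02'), ('2011-08-03');
""")

# (datesofinterest x user_ids) minus the (date, user_id) pairs that already exist.
MISSING_SQL = """
    SELECT d.date, u.user_id
    FROM datesofinterest AS d
    CROSS JOIN (SELECT DISTINCT user_id FROM transactions) AS u
    EXCEPT
    SELECT date_column, user_id FROM transactions
    ORDER BY 2, 1
"""
missing = conn.execute(MISSING_SQL).fetchall()

# Keep only the dates on which user 2 has no entry.
missing_for_user_2 = [d for d, uid in missing if uid == 2]
print(missing_for_user_2)
```

For a single user, adding a WHERE user_id = ? filter to both halves of the query would avoid building the full cross product.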
Related
I have a table like this in SQLITE3:
I need to query this table by ID|DOC_ID|TRANS_DOC_ID and most importantly by DATE because I need to get the data day by day. ex: TODAY|YESTERDAY|ETC
So far the query is easy, as I can just do this to get the rows by day:
SELECT * FROM CLIENTRECORD WHERE DATE = '2020-12-01'
The problem is when I need to display specific records on other dates:
For example, I have a row with DATE 2020-12-01, but I also want it displayed on DATE 2020-01-01 or maybe 2020-01-02, etc. What do I do in this situation? I thought about adding another column, DATES, holding an array of comma-separated dates, but I read that this is a bad solution. I also thought about adding a separate table just for the dates, but since the number of dates isn't fixed (a row might have 1 date or maybe even 10), I am confused about what I am supposed to do.
The end goal is that a row may or may not have more than one date. If I want to query for the row with or without multiple dates, it would look something like this:
SELECT * FROM CLIENTRECORD WHERE DATE = '2020-12-01' OR DATES LIKE '2020-12-01'
or something similar to it.
I am pretty new to SQL, but I need to use it for my new job because the project requires it. As I don't have an IT background, it is more difficult for me, because this is the first time I am working with SQL professionally.
Hopefully you can help me with it (sorry for my English, I am a non-native speaker):
I need to write a query that returns the unequal IDs between 2 different reference dates.
I have one table with the following data:
DATES   ID        AMOUNT  SID
201910  122424    99999   1
201911  41241242  99999   2
201912  12412424  -22222  3
...
GOAL:
The IDs from DATE 201911 should be compared with those from 201910,
and the query should show me the unequal IDs, so only the unmatched IDs are displayed.
From this result, the AMOUNT should be summed up and grouped by SID.
If you have two dates and you want sids that are only on one of them, then:
select sid
from t
where date in (201911, 201910)
group by sid
having count(distinct date) = 1;
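To see what that query returns, here is a small sketch using Python's sqlite3 module. The rows are made up for illustration (they are not the asker's data), and the table and column names t and date simply follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows, chosen so that each case appears once.
    CREATE TABLE t (date INTEGER, id INTEGER, amount INTEGER, sid INTEGER);
    INSERT INTO t VALUES
        (201910, 100, 50, 1),   -- sid 1 appears on both dates
        (201911, 100, 70, 1),
        (201911, 200, 30, 2),   -- sid 2 appears only on 201911
        (201912, 300, 10, 3);   -- sid 3 is outside the two dates
""")

ONLY_ONE_DATE_SQL = """
    SELECT sid
    FROM t
    WHERE date IN (201911, 201910)
    GROUP BY sid
    HAVING COUNT(DISTINCT date) = 1
    ORDER BY sid
"""
sids_on_one_date = [row[0] for row in conn.execute(ONLY_ONE_DATE_SQL)]
print(sids_on_one_date)
```

Only sid 2 comes back: sid 1 is present on both dates, and sid 3 is removed by the WHERE clause before grouping.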
I will be very grateful for any help you can provide with the following situation.
I have 2 tables and a 3rd table, which is a join of those 2 tables.
Each table contains information on stage changes; the fields that matter for my calculation are the old value of the stage field, the new value, and the date of the change.
In case there is only date1 in table 1, I use the following SQLite code:
select *, case when `Duration1` = 0 then 1 else Duration1 end as "Duration1"
from
(select * ,
coalesce(JULIANDAY(`date2`) - JULIANDAY(`date1`), `report2: Stage Duration`)
as "Duration1"
from table)
In the ideal scenario, table 1 contains the project and the 1st date (date1), and table 2 contains the project and the 2nd date (date2). I can join them to get a 3rd table with the 1st and 2nd dates and calculate the difference between the 2 dates there.
The complication arises when I have 2 dates in table 1 and 1 date in table 2. This is where I need help. I would like to add a condition to the SQL code saying:
if count(dates from report 1/date1)>count(dates from report 2/date2)
then the difference (the duration I need) should be calculated as
today - max(JULIANDAY(`date1`))
This is my first question here. Thank you in advance for your help and understanding!
Below I have the following table structure:
CREATE TABLE StandardTable
(
RecordId varchar(50),
Balance float,
Payment float,
ProcDate date,
RecordIdCreationDate date,
-- multiple other columns used for calculations
)
And here is a small sample of what my data might look like:
RecordId  Balance  Payment  ProcDate    RecordIdCreationDate
1         1000     100      2005-01-01  2005-01-01
2         5000     250      2008-01-01  2008-01-01
3         7500     350      2006-06-01  2006-06-01
1         900      100      2005-02-01  NULL
2         4750     250      2008-02-01  NULL
3         7150     350      2006-07-01  NULL
The table holds data on a transactional basis and has millions of rows in it. The ProcDate field indicates the month in which each transaction is processed. Regardless of when the transaction occurs during the month, the ProcDate field is hard-coded to the first of that month. So if a transaction occurred on 2009-01-17, the ProcDate field would be 2009-01-01. I'm dealing with historical data, and it goes back as early as 2005-01-01. There are multiple instances of each RecordId in the table. A RecordId will show up in each month until the Balance column reaches 0. Some RecordIds originate in the month the data starts (where ProcDate is 2005-01-01) and others don't originate until a later date. The RecordIdCreationDate field holds the date on which the RecordId originated. So that column has millions of NULL values in the table, because it is NULL for every month other than the one in which each RecordId originated.
I need to somehow look at each RecordId, and run a number of different calculations on a month to month basis. What I mean is I have to compare column values for each RecordId where the ProcDate might be something like 2008-01-01, and compare those values to the same column values where the ProcDate would be 2008-02-01. Then after I run my calculations for the RecordId in that month, I have to compare values from 2008-02-01 to values in 2008-03-01 and run my calculations again, etc. I'm thinking that I can do this all within one big WHILE loop, but I'm not entirely sure what that would look like.
The first thing I did was create another table in my database that had the same table design as my StandardTable and I called it ProcTable. In that ProcTable, I inserted all of the data where the RecordIdCreationDate was not equal to NULL. This gave me the first instance of each RecordId in the database. I was able to run my calculations for the first month successfully, but where I'm struggling is how I use the column values in the ProcTable, and compare those to the column values where the ProcDate is the month after that. Even if I could somehow do that, I'm not sure how I would repeat that process to compare the 2nd month's data to the 3rd month's data, and the 3rd month's data to the 4th month's data, etc.
Any suggestions? Thanks in advance.
It seems to me that all you need to do is JOIN the table to itself on this condition:
ON MyTable1.RecordId = MyTable2.RecordId
AND MyTable1.ProcDate = DATEADD(month, -1, MyTable2.ProcDate)
Then you will have all the rows in your table (MyTable1), joined to the same RecordId's row from the next month (MyTable2).
And in each row you can do whatever calculations you want between the two joined tables.
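A minimal sketch of that self-join, run through Python's sqlite3 module. Two things here are assumptions: SQLite has no DATEADD, so date(ProcDate, '+1 month') stands in for it, and the aliases cur and nxt as well as the BalanceDrop calculation are only there for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- A few rows shaped like the sample data in the question.
    CREATE TABLE StandardTable (
        RecordId TEXT, Balance REAL, Payment REAL, ProcDate TEXT
    );
    INSERT INTO StandardTable VALUES
        ('1', 1000, 100, '2005-01-01'),
        ('1',  900, 100, '2005-02-01'),
        ('2', 5000, 250, '2008-01-01'),
        ('2', 4750, 250, '2008-02-01');
""")

# Pair each row (cur) with the same RecordId's row from the following month (nxt).
MONTH_OVER_MONTH_SQL = """
    SELECT cur.RecordId,
           cur.ProcDate,
           nxt.ProcDate,
           cur.Balance - nxt.Balance AS BalanceDrop
    FROM StandardTable AS cur
    JOIN StandardTable AS nxt
      ON cur.RecordId = nxt.RecordId
     AND nxt.ProcDate = date(cur.ProcDate, '+1 month')
    ORDER BY cur.RecordId, cur.ProcDate
"""
pairs = conn.execute(MONTH_OVER_MONTH_SQL).fetchall()
for row in pairs:
    print(row)
```

Because every month is joined to its successor in one statement, no WHILE loop over the months is needed. An index on (RecordId, ProcDate) would probably matter for a table with millions of rows.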
I have a table with 4 columns, id, Stream which is text, Duration (int), and Timestamp (datetime). There is a row inserted for every time someone plays a specific audio stream on my website. Stream is the name, and Duration is the time in seconds that they are listening. I am currently using the following query to figure up total listen hours for each week in a year:
SELECT YEARWEEK(`Timestamp`), (SUM(`Duration`)/60/60) FROM logs_main
WHERE `Stream`="asdf" GROUP BY YEARWEEK(`Timestamp`);
This does what I expect... presenting a total of listen time for each week in the year that there is data.
However, I would like to build a query that also returns a result row for weeks that have no data. For example, if the 26th week of 2006 has no rows that fall within that week, then I would like the SUM result to be 0.
Is it possible to do this? Maybe via a JOIN over a date range somehow?
The tried and true old-school solution is to set up another table with a bunch of date ranges that you can outer join with for the grouping (that is, the other table would have all of the weeks in it, each with a begin and end date).
In this case, you could just get by with a table full of the values from YEARWEEK:
201100
201101
201102
201103
201104
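And here is a sketch of a SQL statement (the Stream filter goes in the ON clause, because a WHERE on logs_main would discard the weeks that have no rows):
SELECT year_weeks.yearweek, COALESCE(SUM(`Duration`), 0)/60/60
FROM year_weeks LEFT OUTER JOIN logs_main
  ON year_weeks.yearweek = YEARWEEK(logs_main.`Timestamp`)
  AND logs_main.`Stream` = "asdf"
GROUP BY year_weeks.yearweek;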
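The outer-join idea can be tried out with Python's sqlite3 module. This is a sketch under assumptions: SQLite has no YEARWEEK(), so strftime('%Y%W', ...) stands in for it (its week numbering is not identical to MySQL's), the log rows are invented, and the Stream filter is placed in the ON clause so that empty weeks are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per week that should appear in the report.
    CREATE TABLE year_weeks (yearweek TEXT PRIMARY KEY);
    INSERT INTO year_weeks VALUES ('201101'), ('201102'), ('201103');

    -- Hypothetical log rows; week 201102 has no 'asdf' plays.
    CREATE TABLE logs_main (Stream TEXT, Duration INTEGER, Timestamp TEXT);
    INSERT INTO logs_main VALUES
        ('asdf',  3600, '2011-01-04 10:00:00'),
        ('asdf',  1800, '2011-01-05 11:00:00'),
        ('other', 7200, '2011-01-12 09:00:00'),
        ('asdf',  7200, '2011-01-18 09:00:00');
""")

# LEFT JOIN keeps every week; COALESCE turns the NULL sum of an empty week into 0.
HOURS_PER_WEEK_SQL = """
    SELECT w.yearweek,
           COALESCE(SUM(l.Duration), 0) / 3600.0 AS hours
    FROM year_weeks AS w
    LEFT JOIN logs_main AS l
      ON strftime('%Y%W', l.Timestamp) = w.yearweek
     AND l.Stream = 'asdf'
    GROUP BY w.yearweek
    ORDER BY w.yearweek
"""
hours_per_week = conn.execute(HOURS_PER_WEEK_SQL).fetchall()
for row in hours_per_week:
    print(row)
```

Week 201102 only contains a play of a different stream, so it still shows up, with 0 hours.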
Here is a suggestion. It might not be exactly what you are looking for.
But say you had a simple table with one column [year_week] that contained the values 1, 2, 3, 4... 52.
You could then theoretically:
SELECT
  A.year_week,
  (SELECT COALESCE(SUM(`Duration`), 0)/60/60 FROM logs_main WHERE
   `Stream` = 'asdf' AND WEEK(`Timestamp`) = A.year_week)
FROM
  tblYearWeeks A
This obviously needs some tweaking. Note that it compares against WEEK() rather than YEARWEEK(), since year_week holds plain week numbers, so restrict the subquery to a single year if the log spans several. I've done several similar queries in other projects, and this works well enough depending on the situation.
If you're looking for a one-table, SQL-based solution, then that is definitely something I would be interested in as well!
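For comparison, the correlated-subquery variant can be sketched the same way with Python's sqlite3 module. The assumptions: SQLite's strftime('%W', ...) stands in for MySQL's week function, tblYearWeeks holds plain week numbers as described above, and the log rows are invented and all fall in one year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Plain week numbers; only three of them to keep the output short.
    CREATE TABLE tblYearWeeks (year_week INTEGER PRIMARY KEY);
    INSERT INTO tblYearWeeks VALUES (1), (2), (3);

    -- Hypothetical log rows, all from 2011.
    CREATE TABLE logs_main (Stream TEXT, Duration INTEGER, Timestamp TEXT);
    INSERT INTO logs_main VALUES
        ('asdf', 3600, '2011-01-04 10:00:00'),
        ('asdf', 7200, '2011-01-18 09:00:00');
""")

# The scalar subquery runs once per week row; an aggregate without GROUP BY
# always yields one row, so COALESCE can map the empty case to 0.
HOURS_BY_WEEK_SQL = """
    SELECT A.year_week,
           (SELECT COALESCE(SUM(Duration), 0) / 3600.0
            FROM logs_main
            WHERE Stream = 'asdf'
              AND CAST(strftime('%W', Timestamp) AS INTEGER) = A.year_week
           ) AS hours
    FROM tblYearWeeks AS A
    ORDER BY A.year_week
"""
hours_by_week = conn.execute(HOURS_BY_WEEK_SQL).fetchall()
for row in hours_by_week:
    print(row)
```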