Get list of accounts with last known info - sql

I'm trying to achieve the following on SQL Server.
On a weekly basis I'll be reading accounts from a database.
On 1/07/2018 I read 5 accounts with their corresponding color.
A week later I read only 2 accounts, because only 2 accounts had changed since then (accountno 10001 and 10004).
Another week later I again read only 2 accounts (10004 and 10005).
As a result I want to achieve the right side.
For each run I want to view the states of the different accounts.
Since I only re-insert changed data (and thus discard records which didn't change compared to last time), I need to make sure that the unchanged records are also present in the result by searching for the last known state of that accountno.
In the end I need a table with 15 records (5 accounts for 3 different dates).
Could anyone help me with this? I can't get my head around this one.

You could use the row_number window function to get just the last entry for each account:
SELECT accountno, color, datemod
FROM (SELECT accountno, color, datemod,
             ROW_NUMBER() OVER (PARTITION BY accountno ORDER BY datemod DESC) AS rn
      FROM mytable) t
WHERE rn = 1
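The query above returns one row per account. To get the 15 rows the question asks for (every account on every run date), one possible approach is to cross join the distinct run dates with the distinct accounts and look up the last known color on or before each date. The following is a minimal sketch of that idea, run through Python's sqlite3 module so it can be tried without a server. The colors and the ISO dates are made-up sample values, and the correlated subquery with LIMIT 1 is SQLite syntax; on SQL Server the same lookup would probably be written with TOP 1 or OUTER APPLY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (accountno INTEGER, color TEXT, datemod TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        # First run: all five accounts (colors are hypothetical sample data).
        (10001, "red", "2018-07-01"), (10002, "blue", "2018-07-01"),
        (10003, "green", "2018-07-01"), (10004, "red", "2018-07-01"),
        (10005, "blue", "2018-07-01"),
        # Second run: only 10001 and 10004 changed.
        (10001, "green", "2018-07-08"), (10004, "blue", "2018-07-08"),
        # Third run: only 10004 and 10005 changed.
        (10004, "green", "2018-07-15"), (10005, "red", "2018-07-15"),
    ],
)

# Every (run date, account) pair, with the most recent color known on that date.
query = """
SELECT d.datemod AS rundate, a.accountno,
       (SELECT m.color
        FROM mytable m
        WHERE m.accountno = a.accountno AND m.datemod <= d.datemod
        ORDER BY m.datemod DESC
        LIMIT 1) AS color
FROM (SELECT DISTINCT datemod FROM mytable) d
CROSS JOIN (SELECT DISTINCT accountno FROM mytable) a
ORDER BY d.datemod, a.accountno
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Account 10002 never changes after the first run, so it keeps its first color on all three dates.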

Related

Avoid double counting - only count first occurrence in table

I am trying to do a count by month of the total number of items (serialnumber) that appears in inventory.
This usually can be easily solved with distinct, however, I only want to count if it is the first occurrence that it appears (first insert).
This query gets me most of the way there.
select date_trunc('month', date) as Date, productid, count(distinct serialnumber)
from inventory
where date_trunc('month', date) >= '2016-01-01' and productID in ('1','2') and status = 'INSERT'
group by date_trunc('month', date), productid
order by date_trunc('month', date) desc
But I realize I am double/triple/quadruple counting some serial numbers because an item can reappear in our inventory multiple times over the course of its lifecycle.
The query above covers these scenarios since the serial numbers appear once:
Shows up as new
Shows up as used
Below are the use cases where I realize I may be double/triple/quadruple counting:
Shows up as new then comes back around as used (no limit to how many times it can appear used)
Shows up used then comes back again as used (no limit to how many times it can appear used)
Here's an example I ran into.
(Note: I have added the condition column to better illustrate this.) This particular serial number has been in inventory three times (first as new, then twice as used).
Date      ProductID  Count  Condition
7-1-21    1          1      u
11-1-18   1          1      u
2-1-17    1          1      n
In my current query results, each insert gets counted (once in Feb 2017, once in Nov 2018 and once in July 2021).
How can I amend my query to make sure I'm only counting the very first instance (insert) a particular serial number appears in the inventory table?
In the subquery, calculate only the first insert date of each product/item using the min aggregate function. Then count the items in that result. The date filter is applied in the outer query, so an item first inserted before 2016 is not counted again when it reappears later:
select Date, productid, count(serialnumber)
from (
    -- first insert month of each serial number over its whole history
    select min(date_trunc('month', date)) as Date, productid, serialnumber
    from inventory
    where productID in ('1','2')
      and status = 'INSERT'
    group by productid, serialnumber
) x
where Date >= '2016-01-01'
group by Date, productid
order by Date desc;
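To see the first-occurrence idea in action, here is a small sketch run through Python's sqlite3 module. It is an illustration under assumptions, not the original poster's environment: the serial numbers SN-A and SN-B are invented, strftime('%Y-%m-01', ...) stands in for date_trunc('month', ...), which SQLite does not have, and the date filter is applied after the first insert has been found.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (date TEXT, productid TEXT, serialnumber TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        # SN-A mirrors the example: inserted three times (new once, used twice).
        ("2017-02-01", "1", "SN-A", "INSERT"),
        ("2018-11-01", "1", "SN-A", "INSERT"),
        ("2021-07-01", "1", "SN-A", "INSERT"),
        # SN-B is a hypothetical second item that appears only once.
        ("2018-11-15", "1", "SN-B", "INSERT"),
    ],
)

query = """
SELECT first_month, productid, COUNT(serialnumber) AS items
FROM (
    -- month of the first insert for each serial number
    SELECT MIN(strftime('%Y-%m-01', date)) AS first_month, productid, serialnumber
    FROM inventory
    WHERE productid IN ('1', '2') AND status = 'INSERT'
    GROUP BY productid, serialnumber
) x
WHERE first_month >= '2016-01-01'
GROUP BY first_month, productid
ORDER BY first_month DESC
"""
result = conn.execute(query).fetchall()
print(result)
```

SN-A is counted only in February 2017, even though it was inserted again in November 2018 and July 2021.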

SQL calculating running total as you go down the rows but also taking other fields into account

I'm hoping you guys can help with this problem.
I have a set of data which I have displayed via excel.
I'm trying to work out the rolling new cap allowance, but I need to deduct the previous weeks' bookings. I don't want to use a cursor, so can anyone help?
I'm going to group by the product id, so it will need to start afresh for every product.
In the image, columns A to D are fixed and I am trying to calculate the data in column E ('New Cap'). The 'New Cap' column shows the expected results.
Column F gives a detailed formula of what I'm trying to do.
Not sure what I've done for the post to be marked down.
Thanks
Update:
The formula looks like this.
You want the sum of the cap through this row minus the sum of booked through the previous row. This is easy to do with window functions. Summing cap - booked through the current row also subtracts the current row's booked, so it is added back:
select t.*,
       (sum(cap - booked) over (partition by productid order by weekbeg) + booked
       ) as new_cap
from t;
You can get the new running total using the lag and sum window functions: first calculate cap minus the previous row's booked, then use sum() over() for the running total:
select weekbeg, ProductId, Cap, Booked,
       Sum(n) over (partition by productid order by weekbeg) New_Cap
from (
    select *, cap - Lag(booked, 1, 0) over (partition by productid order by weekbeg) n
    from t
) t
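As a quick check of the rule "sum of cap through this row minus sum of booked through the previous row", the sketch below writes that rule out literally and runs it through Python's sqlite3 module. The table name t and the column names come from the answers; the rows themselves are invented sample values, and SQLite 3.25 or later is assumed for window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (weekbeg TEXT, productid INTEGER, cap INTEGER, booked INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        # Hypothetical sample data: two products, weekly rows.
        ("2021-01-04", 1, 10, 3),
        ("2021-01-11", 1, 5, 2),
        ("2021-01-18", 1, 0, 4),
        ("2021-01-04", 2, 20, 5),
        ("2021-01-11", 2, 0, 1),
    ],
)

# Running cap through this row, minus running booked through the previous row.
# (running booked through this row) - booked = running booked through the previous row.
query = """
SELECT weekbeg, productid,
       SUM(cap) OVER (PARTITION BY productid ORDER BY weekbeg)
       - (SUM(booked) OVER (PARTITION BY productid ORDER BY weekbeg) - booked) AS new_cap
FROM t
ORDER BY productid, weekbeg
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

For product 1 the second week comes out as 10 + 5 - 3 = 12, and the running total starts afresh for product 2 because of the partition by productid.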

Use SQL to ensure I have data for each day of a certain time period

I'm looking to select only one data point from each date in my report. I want to ensure each day is accounted for and has at least one row of information. We had to do a few different things to move a large data file into our data warehouse (import one large Google Sheet for some data, use Python for daily pulls of some of the other data), so I want to make sure no date was left out. The data goes from last summer through now. I could do a COUNT DISTINCT just to check that the number of distinct dates matches the number of days between the first data point and yesterday (the latest data point), but I want to verify each day is accounted for. I should mention I am in BigQuery. Also, an example of the created_at style is: 2021-02-09 17:05:44.583 UTC
This is what I have so far:
SELECT FIRST(created_at)
FROM 'large_table'
ORDER BY created_at
Note: I know FIRST is probably not the best function for this case; currently it just grabs the very first data point in created_at, but I'm using it as a jumping-off point.
You can use aggregation:
select any_value(lt).*
from large_table lt
group by created_at
order by min(created_at);
Note: This assumes that created_at is a date -- or at least only has one value per date. You might need to convert it to a date:
select any_value(lt).*
from large_table lt
group by date(created_at)
order by min(created_at);
BigQuery equivalent of the query in your question:
SELECT created_at
FROM `large_table`
ORDER BY created_at
LIMIT 1
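The queries above pick one row per day, but they do not list days that have no rows at all. One way to verify that each day is accounted for is to build a calendar from the first to the last date and anti-join it against the dates that actually occur. The sketch below shows the idea with Python's sqlite3 module and a recursive CTE. The sample timestamps are invented (with the trailing " UTC" dropped so SQLite's date() can parse them). In BigQuery the calendar would more likely come from GENERATE_DATE_ARRAY than from a recursive CTE; that part is an assumption about how you would port it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE large_table (created_at TEXT)")
conn.executemany(
    "INSERT INTO large_table VALUES (?)",
    [
        # Hypothetical sample data: 2021-02-08 is deliberately missing.
        ("2021-02-07 09:00:00.000",),
        ("2021-02-07 18:30:00.000",),
        ("2021-02-09 17:05:44.583",),
        ("2021-02-10 08:00:00.000",),
    ],
)

query = """
WITH RECURSIVE bounds AS (
    SELECT MIN(date(created_at)) AS first_day,
           MAX(date(created_at)) AS last_day
    FROM large_table
),
calendar(day) AS (
    -- one row per calendar day between the first and last data point
    SELECT first_day FROM bounds
    UNION ALL
    SELECT date(day, '+1 day') FROM calendar, bounds WHERE day < last_day
)
SELECT c.day
FROM calendar c
LEFT JOIN (SELECT DISTINCT date(created_at) AS day FROM large_table) seen
       ON seen.day = c.day
WHERE seen.day IS NULL
ORDER BY c.day
"""
missing_days = [row[0] for row in conn.execute(query)]
print(missing_days)
```

An empty result would mean every day in the range has at least one row.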

Group or Sum the data based on overlapping period

I'm working on migrating legacy system data to a new system. I'm trying to migrate the data with history based on the changed date. My current query produces the output below.
Since it's a legacy system, some of the data falls within the same period. I want to group the data by id and name, and add up the value as an active or inactive record depending on whether the data falls within the same period.
My expected output:
For example, let's take 119 and explain. One row is marked in yellow since it does not fall within any period that overlaps with the other rows, but the other two rows overlap in the period 01-Nov-18 to 30-Sep-19.
I need to split the data for the overlapping period, and add the values only for the overlapped period. So I need to look for combinations based on date, which introduces new rows: the non-overlapped part results in the two rows below.
Another row covers the overlapped period.
The same scenario applies to 148324: two rows are introduced, one for the overlapped period and another for the non-overlapped period.
Also, is it possible to get the non-overlapped data alone based on some condition? I want to move only the overlapping data to a temp table, so that I can move the non-overlapped data directly to the output table.
I don't think I have a 100% solution, as it's hard to decide which data are right and how to sort them.
This query is based on the lead/lag analytic functions. I had to replace NULL values with suitable sentinel values (far future and far past).
Please try this query and modify it; I hope it will fit your case.
My table:
Query:
SELECT id, name, value, startdate, enddate,
       CASE WHEN nvl(next_startdate,29993112)>nvl(prev_enddate,19900101) THEN 'Y' ELSE 'N' END AS active
FROM
(
    SELECT datatable.*,
           lag(enddate) over (partition by id,name order by startdate,value desc) prev_enddate,
           lead(startdate) over (partition by id,name order by startdate,value desc) next_startdate
    FROM datatable
) dt
Results:

Is there a way to apply DISTINCT to more than 1 field?

I need a report that has office, date and order count. I need the total count of orders per month, but only 1 order count per day.
e.g.
West 1/1/2009 1 order
West 1/1/2009 1 order
West 1/2/2009 1 order
on my report I would see
West 1/1/2009 1 order
West 1/2/2009 1 order
and my total orders would be 2.
This would be really easy with SQL, I know, but I do not have access.
Are you just looking for this?
SELECT DISTINCT Office, Date, OrderCount FROM YourTable
This would de-duplicate your results, but the data set is too small to know for sure if this is what you're trying to accomplish. Using the DISTINCT keyword returns only unique combinations of Office, Date, and OrderCount: in this case, one line per day/office.
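To make the effect of DISTINCT over several columns concrete, here is a small sketch using Python's sqlite3 module and the three sample rows from the question. The table name YourTable is the placeholder from the query above, and storing the dates as plain text is a simplification for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Office TEXT, Date TEXT, OrderCount INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        # The three rows from the question; the first two are duplicates.
        ("West", "1/1/2009", 1),
        ("West", "1/1/2009", 1),
        ("West", "1/2/2009", 1),
    ],
)

# DISTINCT applies to the whole (Office, Date, OrderCount) combination.
distinct_rows = conn.execute(
    "SELECT DISTINCT Office, Date, OrderCount FROM YourTable ORDER BY Date"
).fetchall()

# Total orders after de-duplication.
total_orders = conn.execute(
    "SELECT SUM(OrderCount) FROM "
    "(SELECT DISTINCT Office, Date, OrderCount FROM YourTable)"
).fetchone()[0]

print(distinct_rows, total_orders)
```

The duplicate row for 1/1/2009 collapses into one, which gives the total of 2 described in the question.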
UPDATE: Ah, I didn't read the part where you don't have SQL access. You still have two choices:
1. In Crystal Reports Designer, in the "Database" menu, check the "Select Distinct Records" option at the bottom of the menu.
2. Edit the SQL query directly: Database menu -> Database Expert -> under "Current Connections", click "Add new command" and type your SQL command. Modify the one I provided above to meet your needs, and it should do the trick.
You can create three groups: one for office, one for date, and one for order. Then put the fields in the day group footer and suppress the other sections. This will cause the report to show a new section for each day, but only show one row for each order. Then you can add your running total to the section. Set the running total up to sum the field you want, evaluate on change of the day group, and reset on change of month (you'll need to set up a formula for this one to evaluate the month).
This should group and order the report the way you are looking for, with a running total alongside that resets each month. Hope this helps.