SQL: Data Value Revision - sql

I'm trying to pull any ice cream flavors that have a revised flavor_count in our database table. The end result should look something like this:.
Ice_cream_table
I have been working on this code that basically returns all rows (over a million in are DB) incorrectly, and provides the wrong date timestamp:
create table want as (
with temp_tbl as (
select region,report,customer,date,ice_cream_flavor,flavor_count,Order_status,Change_dt
from ice_cream_table
where customer='12345' and Change_dt > to_date(%STR(%'&last_request%'),'DD-MON-YYYY:HH24:MI:SS') and region='south')
select b.region,b.report, b.customer, b.date, b.ice_cream_flavor,
a.flavor_count as "Previous Value",
b.flavor_count as "Current Value",
a.Change_dt as "Previous Date"
from (
select region, report, customer, date, ice_cream_flavor, flavor_count
from temp_tbl
where flavor_status='ACTIVE'
order by ice_cream_flavor
) b inner join
(
select region, report, customer, date, ice_cream_flavor, flavor_count, max(Change_dt) as Change_dt
from temp_tbl
where (flavor_status='Deleted' or flavor_status='Discontinued')
group by region, report, customer, date, ice_cream_flavor, flavor_count
order by ice_cream_flavor) a
on b.ice_cream_flavor=a.ice_cream_flavor
where b.flavor_count<>a.flavor_count /*doing this to provide only changes, not all ice_cream_flavors */
);
quit;
Our data is incremental, meaning revisions become 'discontinued' rows and the latest ice_cream_count is 'active'.
Is there a better way to do this or am I just missing something obvious in the code? Sorry for the long post!

Related

SQL Server: Generate squashed range data from daily dates by an id

Basically to sum up the goal. I want to generate something using the following data. Where it subtotals like excel where "each change in ... ordered however" generate summary records and only generate summary records and squash date ranges.
All I want are the count records would generate rows and would copy the fee data except for the begin/end fields would be the ranged dates. So first record would be fee 9858 from 3/31 - 4/14 for example. Each count row would be a new fee record.
I am not sure which combination of grouping, partition by.... and such to get what I need or if there is something else I can use. I can provide sql if needed but I am mainly looking for finding the right combination of tools(partition by, grouping, rollup....) that would provide this functionality.
WITH [ag]
AS (SELECT *,
LAG([Fee_ID]) OVER (ORDER BY [FeeTypeID], [FeeBeginDate]) [FirstFee],
LAG([Fee_ID]) OVER (ORDER BY [FeeTypeID], [FeeBeginDate] DESC) [LastFee]
FROM [dbo].[HHTFees]
WHERE [Retailer] = 517),
[agf]
AS (SELECT *,
'Beg' [FeeStopType]
FROM [ag]
WHERE [ag].[FirstFee] <> [ag].[Fee_ID]
OR [ag].[FirstFee] IS NULL),
[agl]
AS (SELECT *,
'End' [FeeStopType]
FROM [ag]
WHERE [ag].[LastFee] <> [ag].[Fee_ID]
OR [ag].[LastFee] IS NULL),
[results]
AS (SELECT *
FROM [agf]
UNION
SELECT *
FROM [agl]),
[indexed]
AS (SELECT *,
ROW_NUMBER() OVER (ORDER BY [results].[FeeTypeID], [results].[FeeBeginDate]) [RowNum]
FROM [results])
SELECT [Starts].[Retailer],
[Starts].[Chain_key],
[Starts].[Fee_ID],
[Starts].[FeeTypeID],
[Starts].[FeeChgTypeID],
[Starts].[Fee],
[Starts].[FeeDescription],
[Starts].[FeeTypeDescription],
[Starts].[FeeBeginDate],
[Ends].[FeeEndDate],
[Starts].[FeeAmt],
[Starts].[HHT_ID],
[Starts].[CreatedBy],
[Starts].[CreatedDate],
[Starts].[ModifiedBy],
[Starts].[ModifiedDate],
[Starts].[MsgID],
[Starts].[ScopeOrder],
[Starts].[Scope],
[Starts].[FirstFee],
[Starts].[LastFee],
[Starts].[FeeStopType],
[Starts].[RowNum]
FROM [indexed] [Starts]
INNER JOIN [indexed] [Ends]
ON [Starts].[Fee_ID] = [Ends].[Fee_ID]
AND [Ends].[RowNum] = [Starts].[RowNum] + 1;
I used lag to find where the fee id changed sorted by feetype / start date. This allowed me to mimic when fee id changed sorted by begin date like excel subtotal. Now to optimize and tweak to fit the set of data.

Select only the row with the max value, but the column with this info is a SUM()

I have the following query:
SELECT DISTINCT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT
FROM TGFCAB CAB, TGFPAR PAR, TSIBAI BAI
WHERE CAB.CODPARC = PAR.CODPARC
AND PAR.CODBAI = BAI.CODBAI
AND CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI
Which the result is this. Company names and neighborhood hid for obvious reasons
The query at the moment, for those who don't understand Latin languages, is giving me clients, company name, company neighborhood, and the total value of movements.
in the WHERE clause it is only filtering sales movements of companies from an established city.
But if you notice in the Select statement, the column that is retuning the value that aggregates the total amount of value of sales is a SUM().
My goal is to return only the company that have the maximum value of this column, if its a tie, display both of em.
This is where i'm struggling, cause i can't seem to find a simple solution. I tried to use
WHERE AMOUNT = MAX(AMOUNT)
But as expected it didn't work
You tagged the question with the whole bunch of different databases; do you really use all of them?
Because, "PL/SQL" reads as "Oracle". If that's so, here's one option.
with temp as
-- this is your current query
(select columns,
sum(vrlnota) as amount
from ...
where ...
)
-- query that returns what you asked for
select *
from temp t
where t.amount = (select max(a.amount)
from temp a
);
You should be able to achieve the same without the need for a subquery using window over() function,
WITH T AS (
SELECT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT,
MAX(VLRNOTA) over() AS MAMOUNT
FROM TGFCAB CAB
JOIN TGFPAR PAR ON PAR.CODPARC = CAB.CODPARC
JOIN TSIBAI BAI ON BAI.CODBAI = PAR.CODBAI
WHERE CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY CAB.CODPARC, PAR.RAZAOSOCIAL, BAI.NOMEBAI
)
SELECT CODPARC, RAZAOSOCIAL, NOMEBAI, AMOUNT
FROM T
WHERE AMOUNT=MAMOUNT
Note it's usually (always) beneficial to join tables using clear explicit join syntax. This should be fine cross-platform between Oracle & SQL Server.

Postgresql query for every day sold stock count

I have project on CRM which maintains product sales order for every organization.
I want to count everyday sold stock which I have managed to do by looping over by date but obviously it is a ridiculous method and taking more time and memory.
Please help me to find out it in single query. Is it possible?
Here is my database structure for your reference.
product : id (PK), name
organization : id (PK), name
sales_order : id (PK), product_id (FK), organization_id (FK), sold_stock, sold_date(epoch time)
Expected Output for selected month :
organization | product | day1_sold_stock | day2_sold_stock | ..... | day30_sold_stock
http://sqlfiddle.com/#!15/e1dc3/3
Create tablfunc :
CREATE EXTENSION IF NOT EXISTS tablefunc;
Query :
select "proId" as ProductId ,product_name as ProductName,organizationName as OrganizationName,
coalesce( "1-day",0) as "1-day" ,coalesce( "2-day",0) as "2-day" ,coalesce( "3-day",0) as "3-day" ,
coalesce( "4-day",0) as "4-day" ,coalesce( "5-day",0) as "5-day" ,coalesce( "6-day",0) as "6-day" ,
coalesce( "7-day",0) as "7-day" ,coalesce( "8-day",0) as "8-day" ,coalesce( "9-day",0) as "9-day" ,
coalesce("10-day",0) as "10-day" ,coalesce("11-day",0) as "11-day" ,coalesce("12-day",0) as "12-day" ,
coalesce("13-day",0) as "13-day" ,coalesce("14-day",0) as "14-day" ,coalesce("15-day",0) as"15-day" ,
coalesce("16-day",0) as "16-day" ,coalesce("17-day",0) as "17-day" ,coalesce("18-day",0) as "18-day" ,
coalesce("19-day",0) as "19-day" ,coalesce("20-day",0) as "20-day" ,coalesce("21-day",0) as"21-day" ,
coalesce("22-day",0) as "22-day" ,coalesce("23-day",0) as "23-day" ,coalesce("24-day",0) as "24-day" ,
coalesce("25-day",0) as "25-day" ,coalesce("26-day",0) as "26-day" ,coalesce("27-day",0) as"27-day" ,
coalesce("28-day",0) as "28-day" ,coalesce("29-day",0) as "29-day" ,coalesce("30-day",0) as "30-day" ,
coalesce("31-day",0) as"31-day"
from crosstab(
'select hist.product_id,pr.name,o.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),sum(sold_stock)
from sales_order hist
left join product pr on pr.id = hist.product_id
left join organization o on o.id = hist.organization_id
where EXTRACT(MONTH FROM TO_TIMESTAMP(hist.sold_date/1000)) =5
and EXTRACT(YEAR FROM TO_TIMESTAMP(hist.sold_date/1000)) = 2017
group by hist.product_id,pr.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),o.name
order by o.name,pr.name',
'select d from generate_series(1,31) d')
as ("proId" int ,product_name text,organizationName text,
"1-day" float,"2-day" float,"3-day" float,"4-day" float,"5-day" float,"6-day" float
,"7-day" float,"8-day" float,"9-day" float,"10-day" float,"11-day" float,"12-day" float,"13-day" float,"14-day" float,"15-day" float,"16-day" float,"17-day" float
,"18-day" float,"19-day" float,"20-day" float,"21-day" float,"22-day" float,"23-day" float,"24-day" float,"25-day" float,"26-day" float,"27-day" float,"28-day" float,
"29-day" float,"30-day" float,"31-day" float);
Please note, use PostgreSQL Crosstab Query. I have used coalesce for handling null values(Crosstab Query to show "0" when there is null data to return).
Following query will help to find the same:
select o.name,
p.name,
sum(case when extract (day from to_timestamp(sold_date))=1 then sold_stock else 0 end)day1_sold_stock,
sum(case when extract (day from to_timestamp(sold_date))=2 then sold_stock else 0 end)day2_sold_stock,
sum(case when extract (day from to_timestamp(sold_date))=3 then sold_stock else 0 end)day3_sold_stock,
from sales_order so,
organization o,
product p
where so.organization_id=o.id
and so.product_id=p.id
group by o.name,
p.name;
I just provided logic to find for 3 days, you can implement the same for rest of the days.
basically first do basic joins on id, and then check if each date(after converting epoch to timestamp and then extract day).
You have a few options here but it is important to understand the limitations first.
The big limitation is that the planner needs to know the record size before the planning stage, so this has to be explicitly defined, not dynamically defined. There are various ways of getting around this. At the end of the day, you are probably going to have somethign like Bavesh's answer, but there are some tools that may help.
Secondly, you may want to aggregate by date in a simple query joining the three tables and then pivot.
For the second approach, you could:
You could do a simple query and then pull the data into Excel or similar and create a pivot table there. This is probably the easiest solution.
You could use the tablefunc extension to create the crosstab for you.
Then we get to the first problem which is that if you are always doing 30 days, then it is easy if tedious. But if you want to do every day for a month, you run into the row length problem. Here what you can do is create a dynamic query in a function (pl/pgsql) and return a refcursor. In this case the actual planning takes place in the function and the planner doesn't need to worry about it on the outer level. Then you call FETCH on the output.

SQL Server - Only Select Latest Date

RDBMS = Microsoft SQL Server
I work for a refrigeration company and we want to do a better job of tracking the cost bottles of refrigerant were bought at for each inventory location. I am trying to create a SQL Query that pulls this information but I am running into some issues. For each inventory location I want to display the last cost refrigerant was bought at for that inventory location.I want to see the latest date we have record of for this location purchasing a specific refrigerant. I have tried using the Max function unsuccessfully and the Row_Number function I have not been able to get work. Any help would be much appreciated.
See below the code sample I am trying to only get to display the Latest Date each inventory location purchased R-22 30 pound jug.
select
lctn_id as Location,
invntryitm_id as InventoryItemID,
invntryitm_nme as InventoryItemName,
prchseordrlst_dte_rqstd as DateRequested,
prchseordrlst_unt_cst as UnitCost
from
invntryitm
join
prchseordrlst on prchseordrlst.invntryitm_rn = invntryitm.invntryitm_rn
join
prchseordr on prchseordr.prchseordr_rn = prchseordrlst.prchseordr_rn
join
lctn on lctn.lctn_rn = prchseordr.lctn_rn
where
invntryitm.invntryitm_nme ='REFRIGERANT R-22 30#'
and lctn_obslte = 'N'
group by
lctn.lctn_id, invntryitm.invntryitm_id, invntryitm.invntryitm_nme,
prchseordrlst.prchseordrlst_unt_cst
order by
lctn_id
I think an analytic/windowing function would give you what you need:
with location_data as (
select
lctn_id as Location,
invntryitm_id as InventoryItemID,
invntryitm_nme as InventoryItemName,
prchseordrlst_dte_rqstd as DateRequested,
prchseordrlst_unt_cst as UnitCost,
max (prchseordrlst_dte_rqstd) over (partition by lctn_id) as max_date
from
invntryitm
JOIN prchseordrlst on prchseordrlst.invntryitm_rn = invntryitm.invntryitm_rn
JOIN prchseordr on prchseordr.prchseordr_rn = prchseordrlst.prchseordr_rn
JOIN lctn on lctn.lctn_rn = prchseordr.lctn_rn
where
invntryitm.invntryitm_nme ='REFRIGERANT R-22 30#' and
lctn_obslte = 'N'
)
select *
from location_data
where max_date = DateRequested
order by Location
Bear in mind that if there is a tie, two location_id records with the same date, then you will get both of them back. If this is an issue, then you probably want row_number() instead of max():
row_number() over (partition by lctn_id order by prchseordrlst_dte_rqstd desc) as rn
And then you would
where rn = 1
to get the first row
The reason I didn't list row_number() first is that max is O(n), and if your data has dates and times, it may be sufficient for what you need.

Crystal Reports: Record Selection Snafu

I had received some really good help earlier, and I appreciate it.
I have another record selection snafu.
I have a parameter that I need to set as the end date.
I need to pull the most recent state before the end date from a table titled state_change.
I need to exclude any records from the report who are not in the required states at that period in time.
state is set currently to be state_change.new_state
( {#grouping} = "Orders" and rec_date < {?endDate} and {#state} in [0,2,5] )
OR
( {#grouping} = "Stock" and rec_date < {?endDate} and {#state} in [1,2,3,5,7] )
If I could run a SQL query to pull this information, it would probably work, but I cannot figure out how to do it.
Essentially, I need #state to be:
Select max(new_state)
From state_change
where change_time < {?endDate}
but on a per item level.
Any help would be appreciated.
You'll probably need to use a command object with a parameter for your end date, or create a parameterized stored procedure. The command object will allow you to enter all the sql you need, like joining your results with the max newState value before the end date:
select itemID, new_state, rec_date, max_newState from
(select itemID, new_state, rec_date from table) t inner join
(Select itemID, max(new_state) as max_newState
From state_change
where change_time < {?endDate}
group by itemID) mx on t.itemid = mx.itemID and t.new_state = mx.max_newState
I can't tell if your orders and stock groupings are in the same table, so I'm not sure how you need to limit your sets by the correct state values.