I have a project on a CRM which maintains product sales orders for every organization.
I want to count the stock sold on each day, which I have managed to do by looping over the dates, but obviously that is a ridiculous method and it takes too much time and memory.
Please help me find a way to do it in a single query. Is it possible?
Here is my database structure for your reference.
product : id (PK), name
organization : id (PK), name
sales_order : id (PK), product_id (FK), organization_id (FK), sold_stock, sold_date (epoch time, in milliseconds)
Expected output for the selected month:
organization | product | day1_sold_stock | day2_sold_stock | ..... | day30_sold_stock
http://sqlfiddle.com/#!15/e1dc3/3
Create the tablefunc extension:
CREATE EXTENSION IF NOT EXISTS tablefunc;
Query :
select "proId" as ProductId ,product_name as ProductName,organizationName as OrganizationName,
coalesce( "1-day",0) as "1-day" ,coalesce( "2-day",0) as "2-day" ,coalesce( "3-day",0) as "3-day" ,
coalesce( "4-day",0) as "4-day" ,coalesce( "5-day",0) as "5-day" ,coalesce( "6-day",0) as "6-day" ,
coalesce( "7-day",0) as "7-day" ,coalesce( "8-day",0) as "8-day" ,coalesce( "9-day",0) as "9-day" ,
coalesce("10-day",0) as "10-day" ,coalesce("11-day",0) as "11-day" ,coalesce("12-day",0) as "12-day" ,
coalesce("13-day",0) as "13-day" ,coalesce("14-day",0) as "14-day" ,coalesce("15-day",0) as"15-day" ,
coalesce("16-day",0) as "16-day" ,coalesce("17-day",0) as "17-day" ,coalesce("18-day",0) as "18-day" ,
coalesce("19-day",0) as "19-day" ,coalesce("20-day",0) as "20-day" ,coalesce("21-day",0) as"21-day" ,
coalesce("22-day",0) as "22-day" ,coalesce("23-day",0) as "23-day" ,coalesce("24-day",0) as "24-day" ,
coalesce("25-day",0) as "25-day" ,coalesce("26-day",0) as "26-day" ,coalesce("27-day",0) as"27-day" ,
coalesce("28-day",0) as "28-day" ,coalesce("29-day",0) as "29-day" ,coalesce("30-day",0) as "30-day" ,
coalesce("31-day",0) as"31-day"
from crosstab(
'select hist.product_id,pr.name,o.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),sum(sold_stock)
from sales_order hist
left join product pr on pr.id = hist.product_id
left join organization o on o.id = hist.organization_id
where EXTRACT(MONTH FROM TO_TIMESTAMP(hist.sold_date/1000)) =5
and EXTRACT(YEAR FROM TO_TIMESTAMP(hist.sold_date/1000)) = 2017
group by hist.product_id,pr.name,EXTRACT(day FROM TO_TIMESTAMP(hist.sold_date/1000)),o.name
order by o.name,pr.name',
'select d from generate_series(1,31) d')
as ("proId" int ,product_name text,organizationName text,
"1-day" float,"2-day" float,"3-day" float,"4-day" float,"5-day" float,"6-day" float
,"7-day" float,"8-day" float,"9-day" float,"10-day" float,"11-day" float,"12-day" float,"13-day" float,"14-day" float,"15-day" float,"16-day" float,"17-day" float
,"18-day" float,"19-day" float,"20-day" float,"21-day" float,"22-day" float,"23-day" float,"24-day" float,"25-day" float,"26-day" float,"27-day" float,"28-day" float,
"29-day" float,"30-day" float,"31-day" float);
Please note, this uses the PostgreSQL crosstab query. I have used coalesce to handle null values (so that the crosstab shows "0" where there is no data to return).
The following query will help to find the same:
select o.name,
       p.name,
       sum(case when extract(day from to_timestamp(sold_date/1000)) = 1 then sold_stock else 0 end) day1_sold_stock,
       sum(case when extract(day from to_timestamp(sold_date/1000)) = 2 then sold_stock else 0 end) day2_sold_stock,
       sum(case when extract(day from to_timestamp(sold_date/1000)) = 3 then sold_stock else 0 end) day3_sold_stock -- sold_date is epoch milliseconds (see schema), hence the /1000
from sales_order so,
     organization o,
     product p
where so.organization_id = o.id
  and so.product_id = p.id
group by o.name,
         p.name;
I have only provided the logic for the first 3 days; you can implement the same pattern for the rest of the days, as shown below.
Basically, first do the basic joins on the ids, and then check each date (after converting the epoch value to a timestamp and extracting the day).
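For a full month you would repeat that pattern through day 31 and add a month/year filter, along these lines (a sketch only, borrowing the first answer's assumptions: May 2017, and sold_date stored as epoch milliseconds):
select o.name as organization,
       p.name as product,
       sum(case when extract(day from to_timestamp(so.sold_date/1000)) = 1 then so.sold_stock else 0 end) day1_sold_stock,
       sum(case when extract(day from to_timestamp(so.sold_date/1000)) = 2 then so.sold_stock else 0 end) day2_sold_stock,
       -- ... repeat for days 3 through 30 ...
       sum(case when extract(day from to_timestamp(so.sold_date/1000)) = 31 then so.sold_stock else 0 end) day31_sold_stock
from sales_order so
join organization o on so.organization_id = o.id
join product p on so.product_id = p.id
where extract(month from to_timestamp(so.sold_date/1000)) = 5
  and extract(year from to_timestamp(so.sold_date/1000)) = 2017
group by o.name, p.name;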
You have a few options here but it is important to understand the limitations first.
The big limitation is that the planner needs to know the record size before the planning stage, so the output columns have to be explicitly defined, not dynamically defined. There are various ways of getting around this. At the end of the day, you are probably going to have something like Bavesh's answer, but there are some tools that may help.
Secondly, you may want to aggregate by date in a simple query joining the three tables, and then pivot.
For the pivot step, you have two options:
You could do a simple query, pull the data into Excel or similar, and create a pivot table there. This is probably the easiest solution.
You could use the tablefunc extension to create the crosstab for you, as in the first answer above.
That brings us back to the first problem: if you are always producing the same fixed set of day columns, the query is easy if tedious. But if you want the column set to vary with the month, you run into the fixed row-definition problem. What you can do here is build the dynamic query in a PL/pgSQL function and return a refcursor. The actual planning then takes place inside the function, so the planner doesn't need to worry about it at the outer level. You then call FETCH on the output, as in the sketch below.
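A minimal sketch of that approach (the function name, the cursor name, and the epoch-millisecond assumption are all illustrative, not a tested implementation):
CREATE OR REPLACE FUNCTION monthly_sales_pivot(p_year int, p_month int)
RETURNS refcursor AS
$$
DECLARE
    cur  refcursor := 'monthly_sales_cur';  -- named so the caller can FETCH from it
    cols text;
    qry  text;
BEGIN
    -- build one "N-day" column per day; could be limited to the month's real length
    SELECT string_agg(
             format('sum(case when extract(day from to_timestamp(so.sold_date/1000)) = %s then so.sold_stock else 0 end) as %I',
                    d, d || '-day'),
             ', ')
      INTO cols
      FROM generate_series(1, 31) AS d;

    qry := format(
        'select o.name as organization, p.name as product, %s
           from sales_order so
           join product p on p.id = so.product_id
           join organization o on o.id = so.organization_id
          where extract(month from to_timestamp(so.sold_date/1000)) = %s
            and extract(year from to_timestamp(so.sold_date/1000)) = %s
          group by o.name, p.name
          order by o.name, p.name', cols, p_month, p_year);

    OPEN cur FOR EXECUTE qry;  -- planning of the dynamic row shape happens here
    RETURN cur;
END;
$$ LANGUAGE plpgsql;
You would call it inside a transaction and fetch from the returned cursor:
BEGIN;
SELECT monthly_sales_pivot(2017, 5);
FETCH ALL FROM monthly_sales_cur;
COMMIT;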
I have the following query:
SELECT DISTINCT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT
FROM TGFCAB CAB, TGFPAR PAR, TSIBAI BAI
WHERE CAB.CODPARC = PAR.CODPARC
AND PAR.CODBAI = BAI.CODBAI
AND CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI
Here is the result (company names and neighborhoods hidden for obvious reasons).
The query at the moment, for those who don't understand Latin languages, gives me the clients, company name, company neighborhood, and the total value of movements.
In the WHERE clause it only filters sales movements of companies from a given city.
But if you notice in the SELECT statement, the column returning the value that aggregates the total amount of sales is a SUM().
My goal is to return only the company that has the maximum value in this column; if there is a tie, display both of them.
This is where I'm struggling, because I can't seem to find a simple solution. I tried to use
WHERE AMOUNT = MAX(AMOUNT)
But, as expected, it didn't work.
You tagged the question with a whole bunch of different databases; do you really use all of them?
Because "PL/SQL" reads as "Oracle". If that's so, here's one option.
with temp as
-- this is your current query
(select columns,
sum(vrlnota) as amount
from ...
where ...
)
-- query that returns what you asked for
select *
from temp t
where t.amount = (select max(a.amount)
from temp a
);
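If you're on Oracle 12c or later, another option (not from the answer above; this uses the standard row-limiting clause) is to order by the aggregate and keep ties:
select cab.codparc, par.razaosocial, bai.nomebai,
       sum(vlrnota) as amount
from tgfcab cab
join tgfpar par on par.codparc = cab.codparc
join tsibai bai on bai.codbai = par.codbai
where cab.tipmov = 'V'
  and statusnota = 'L'
  and par.codcid = 5358
group by cab.codparc, par.razaosocial, bai.nomebai
order by amount desc
fetch first 1 rows with ties;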
You should be able to achieve the same without the need for a subquery by using a window (OVER ()) function:
WITH T AS (
SELECT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT,
-- max of the per-group sums; a bare MAX(VLRNOTA) OVER () is not valid in a grouped query
MAX(SUM(VLRNOTA)) over() AS MAMOUNT
FROM TGFCAB CAB
JOIN TGFPAR PAR ON PAR.CODPARC = CAB.CODPARC
JOIN TSIBAI BAI ON BAI.CODBAI = PAR.CODBAI
WHERE CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY CAB.CODPARC, PAR.RAZAOSOCIAL, BAI.NOMEBAI
)
SELECT CODPARC, RAZAOSOCIAL, NOMEBAI, AMOUNT
FROM T
WHERE AMOUNT=MAMOUNT
Note that it's usually (always) beneficial to join tables using clear, explicit JOIN syntax. This should be fine cross-platform between Oracle & SQL Server.
I got a simple thing to do.
Well, maybe not, but surely someone somewhere can help me out :P
I have a simple data structure that contains:
expedition date
delivery date
transaction type
I need to create a query which can order the rows by a date specific to the transaction type
(i.e., using the expedition date for transactions of type "selling", and the delivery date for transactions of type "purchasing").
I was wondering if there was a more efficient way to do this than by fetching the same data twice with different WHERE clauses (while adding a column used to order them (tempDate)), and then using another SELECT to wrap those two queries and add the ORDER BY on tempDate.
The initial fetch I would run twice works on many tables (many, many, many joins).
Basically my current solution is :
Select * from
(
Select ...,
date_exp as dateTemp
from ...
where conditions* And dateRelatedCondition
UNION
Select ...,
date_livraison as dateTemp
from ...
Where conditions* And NOT(dateRelatedCondition)
) as comboSelect
Order By MIN(comboSelect.dateTemp)
OVER(PARTITION BY(REF_product)),
(REF_product),
comboSelect.dateTemp asc;
* The conditions are the same in both inner SELECT queries.
Thank you for your time.
Without the UNION:
dateRelatedCondition should be removed from the WHERE clause and moved into the SELECT list, like:
CASE WHEN dateRelatedCondition THEN date_exp ELSE date_livraison END as dateTemp
Without the subquery:
in the ORDER BY you need the same expression inside the window function:
Order By MIN(CASE WHEN dateRelatedCondition THEN date_exp ELSE date_livraison END)
OVER(PARTITION BY(REF_product)),
(REF_product),
dateTemp asc
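Putting the two changes together, the whole query collapses to something shaped like this (a sketch: the table name and the "selling" test stand in for the question's many joins and its dateRelatedCondition):
Select t.REF_product,
       CASE WHEN t.transaction_type = 'selling'
            THEN t.date_exp
            ELSE t.date_livraison END as dateTemp
from transactions t   -- stands in for the many joined tables
where conditions*     -- the shared conditions, unchanged
Order By MIN(CASE WHEN t.transaction_type = 'selling'
                  THEN t.date_exp
                  ELSE t.date_livraison END)
         OVER(PARTITION BY t.REF_product),
         t.REF_product,
         dateTemp asc;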
You mean like this?:
ORDER BY CASE
WHEN TransactionType = 'Selling' THEN ExpeditionDate
WHEN TransactionType = 'Purchasing' THEN DeliveryDate
END
I'm trying to include a column calculated as a % of OTYPE, i.e.:
Order type | Status | volume of orders at each status | % of all orders at this status
SELECT
T.OTYPE,
STATUS_CD,
COUNT(STATUS_CD) AS STATVOL,
(STATVOL / COUNT(ROW_ID)) * 100
FROM Database.S_ORDER O
LEFT JOIN /* Finding definitions for status codes & attaching */
(
SELECT
ROW_ID AS TYPEJOIN,
"NAME" AS OTYPE
FROM database.S_ORDER_TYPE
) T
ON T.TYPEJOIN = ORDER_TYPE_ID
GROUP BY (T.OTYPE, STATUS_CD)
/*Excludes pending and pending online orders */
WHERE CAST(CREATED AS DATE) = '2018/09/21' AND STATUS_CD <> 'Pending'
AND STATUS_CD <> 'Pending-Online'
ORDER BY T.OTYPE, STATUS_CD DESC
OTYPE STATUS_CD STATVOL TOTALPERC
Add New Service Provisioning 2,740 100
Add New Service In-transit 13 100
Add New Service Error - Provisioning 568 100
Add New Service Error - Integration 1 100
Add New Service Complete 14,387 100
The current output just puts 100 on every line; I need it to be a % of total orders.
Could anyone help out a Teradata & SQL student?
The complication making this difficult is that my understanding of the GROUP BY and COUNT syntax is tenuous. It took some fiddling to get it displayed as I have it, and I'm not sure how to introduce a calculated column within this combination.
Thanks in advance
There are a couple of places the total could be done, but this is the way I would do it. I also cleaned up your other subquery, which was not required, and changed the date to a non-ambiguous format (change it back if it causes an issue in Teradata).
SELECT
    T."NAME" as OTYPE,
    STATUS_CD,
    COUNT(STATUS_CD) AS STATVOL,
    COUNT(STATUS_CD) * 100 / TotalVol as Pct
FROM database.S_ORDER O
LEFT JOIN database.S_ORDER_TYPE T on T.ROW_ID = ORDER_TYPE_ID
cross join (select count(*) as TotalVol from database.S_ORDER) Tot
WHERE CAST(CREATED AS DATE) = '2018-09-21' AND STATUS_CD <> 'Pending' AND STATUS_CD <> 'Pending-Online'
GROUP BY T."NAME", STATUS_CD, TotalVol
ORDER BY T."NAME", STATUS_CD DESC
A where clause comes before a group by clause, so the query shown in the question isn't valid.
Always prefix every column reference with the relevant table alias; below, where you did not use an alias, I have assumed the column belongs to the orders table.
You probably do not need a subquery for this left join. While there are times when a subquery is needed or good for performance, this does not appear to be the case here.
Most modern SQL-compliant databases provide "window functions", and Teradata does too. They are extremely useful, and here, when you combine count() with an over clause, you can get the total of all rows without needing another subquery or join.
Because there is neither sample data nor an expected result provided with the question, I do not actually know which numbers you really need for your percentage calculation. Instead I have opted to show you different ways to count so that you can choose the right ones. I suspect you are getting 100 for each row because count(status_cd) is equal to count(row_id); you need to count status_cd differently from how you count row_id. NB: the count() function increases by 1 for every non-null value.
I changed the way your date filter is applied. It is not efficient to transform the data on every row to suit constants in a where clause. Leave the data untouched and alter the way you apply the filter to suit the data; this is almost always more efficient (search: sargable).
SELECT
t.OTYPE
, o.STATUS_CD
, COUNT(o.STATUS_CD) count_status
, COUNT(t.ROW_ID) AS count_row_id
, SUM(COUNT(t.ROW_ID)) OVER () AS count_row_id_over -- total across all groups; a window function in a grouped query must operate on an aggregate
FROM dbo.S_ORDER o
LEFT JOIN dbo.S_ORDER_TYPE t ON t.TYPEJOIN = o.ORDER_TYPE_ID
/*Excludes pending and pending online orders */
WHERE o.CREATED >= '2018-09-21' AND o.CREATED < '2018-09-22'
AND o.STATUS_CD <> 'Pending'
AND o.STATUS_CD <> 'Pending-Online'
GROUP BY
t.OTYPE
, o.STATUS_CD
ORDER BY
t.OTYPE
, o.STATUS_CD DESC
As #TomC already noted, there's no need for the join to a derived table. The simplest way to get the percentage is with a group sum (a window aggregate). I also changed the date to a standard SQL date literal and moved the WHERE before the GROUP BY.
SELECT
   t."NAME" AS OTYPE,
   o.STATUS_CD,
   Count(o.STATUS_CD) AS STATVOL,
   -- rule of thumb: multiply first, then divide, otherwise you will get unexpected results
   -- (Teradata rounds after each calculation)
   100.00 * STATVOL / Sum(STATVOL) Over () AS Pct
FROM database.S_ORDER AS o
/* Finding definitions for status codes & attaching */
LEFT JOIN database.S_ORDER_TYPE AS t
  ON t.ROW_ID = o.ORDER_TYPE_ID
/* Excludes pending and pending-online orders */
-- if o.CREATED is a Timestamp there's no need to apply the CAST
WHERE Cast(o.CREATED AS DATE) = DATE '2018-09-21'
  AND o.STATUS_CD NOT IN ('Pending', 'Pending-Online')
GROUP BY OTYPE, o.STATUS_CD
ORDER BY OTYPE, o.STATUS_CD DESC
Btw, you probably don't need an Outer Join, Inner should return the same result.
Apologies in advance for what is probably a very stupid question, but I've been using Google to teach myself SQL after making the move from years of using Crystal Reports.
We have Works Orders which can have numerous transactions against them. I want to find the most recent transaction and have it returned against the Works Order number (which is a unique ID). I attempted to use MAX, but that just returns the Transaction Date of whatever record it is on.
I think my struggles may be caused by a lack of understanding of grouping in SQL. In Crystal it was just 'choose what to group by', but for some reason in SQL I seem to be forced to group by all selected fields.
My ultimate goal is to be able to compare the planned end date of the Works Order ("we need to finish this job by then") with when the last transaction was booked against the Works Order, so that I can create an OTIF KPI.
I've attached an image of what I'm currently seeing in SQL Server 2014 Management Studio, and below is my attempt at the query.
SELECT wip.WO.WO_No
, wip.WO.WO_Type
, stock.Stock_Trans_Log.Part_No
, stock.Stock_Trans_Types.Description
, stock.Stock_Trans_Log.Qty_Change
, stock.Stock_Trans_Log.Trans_Date
, wip.WO.End_Date
, wip.WO.Qty - wip.WO.Qty_Stored AS 'Qty remaining'
, MAX(stock.Stock_Trans_Log.Trans_Date) AS 'Last Production Receipt'
FROM stock.Stock_Trans_Log
INNER JOIN production.Part
ON stock.Stock_Trans_Log.Part_No = production.Part.Part_No
INNER JOIN wip.WO
ON stock.Stock_Trans_Log.WO_No = wip.WO.WO_No
INNER JOIN stock.Stock_Trans_Types
ON stock.Stock_Trans_Log.Tran_Type = stock.Stock_Trans_Types.Type
WHERE (stock.Stock_Trans_Types.Type = 10)
AND (stock.Stock_Trans_Log.Store_Code <> 'BI')
GROUP BY wip.WO.WO_No
, wip.WO.WO_Type
, stock.Stock_Trans_Log.Part_No
, stock.Stock_Trans_Types.Description
, stock.Stock_Trans_Log.Qty_Change
, stock.Stock_Trans_Log.Trans_Date
, wip.WO.End_Date
, wip.WO.Qty - wip.WO.Qty_Stored
HAVING (stock.Stock_Trans_Log.Part_No BETWEEN N'2Z' AND N'9A')
Query + results
If my paraphrase is correct, you could use something along the following lines...
WITH
sequenced_filtered_stock_trans_log AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY WO_No
ORDER BY Trans_Date DESC) AS reversed_sequence_id
FROM
stock.Stock_Trans_Log
WHERE
Type = 10
AND Store_Code <> 'BI'
AND Part_No BETWEEN N'2Z' AND N'9A'
)
SELECT
<stuff>
FROM
sequenced_filtered_stock_trans_log AS stock_trans_log
INNER JOIN
<your joins>
WHERE
stock_trans_log.reversed_sequence_id = 1
First, the WHERE clause filters the log table.
After the WHERE clause is applied, a sequence id is calculated: restarting from one for each partition (each WO_No), and starting from the highest Trans_Date.
Finally, the outer query's WHERE clause specifies that you only want the records with sequence id one, i.e. the most recent row per WO_No. The rest of the joins onto that table proceed as normal.
If there is any other filtering to be done (through joins or any other means), it should all happen before the ROW_NUMBER() is applied.
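Wired into the question's tables, it might look something like this (a sketch based on the question's own joins; Tran_Type = 10 replaces the join on Stock_Trans_Types.Type, and the column list is trimmed to what the OTIF comparison needs):
WITH
sequenced_filtered_stock_trans_log AS
(
    SELECT
        *,
        ROW_NUMBER() OVER (PARTITION BY WO_No
                           ORDER BY Trans_Date DESC) AS reversed_sequence_id
    FROM
        stock.Stock_Trans_Log
    WHERE
        Tran_Type = 10
        AND Store_Code <> 'BI'
        AND Part_No BETWEEN N'2Z' AND N'9A'
)
SELECT
    wo.WO_No,
    wo.End_Date,
    stock_trans_log.Trans_Date AS Last_Production_Receipt
FROM
    sequenced_filtered_stock_trans_log AS stock_trans_log
INNER JOIN
    wip.WO AS wo
        ON stock_trans_log.WO_No = wo.WO_No
WHERE
    stock_trans_log.reversed_sequence_id = 1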
I'm trying to avoid using straight-up SQL in my Rails app, but I need to do a much larger version of this:
SELECT ds.product_id,
( SELECT SUM(units) FROM daily_sales WHERE (date BETWEEN '2015-01-01' AND '2015-01-08') AND service_type = 1 ) as wk1,
( SELECT SUM(units) FROM daily_sales WHERE (date BETWEEN '2015-01-09' AND '2015-01-16') AND service_type = 1 ) as wk2
FROM daily_sales as ds group by ds.product_id
I'm sure it can be done, but I'm struggling to write this as an Active Record statement. Can anyone help?
If you must do this in a single query, you'll need to write some SQL for the CASE statements. The following is what you need:
ranges = [ # ordered array of all your date-ranges
Date.new(2015, 1, 1)..Date.new(2015, 1, 8),
Date.new(2015, 1, 9)..Date.new(2015, 1, 16)
]
overall_range = (ranges.first.min)..(ranges.last.max)
grouping_sub_str = \
ranges.map.with_index(1) do |range, i| # index from 1 so the keys read 'week1', 'week2', ...
"WHEN (date BETWEEN '#{range.min}' AND '#{range.max}') THEN 'week#{i}'"
end.join(' ')
grouping_condition = "CASE #{grouping_sub_str} END"
grouping_columns = ['product_id', grouping_condition]
DailySale.where(date: overall_range).group(grouping_columns).sum(:units)
That will produce a hash with array keys and numeric values. A key will be of the form [product_id, 'week1'] and the value will be the corresponding sum of units for that week.
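For reference, the SQL this generates is roughly the following (the exact aliases vary by Rails version; this is a sketch, not captured output):
SELECT SUM(units) AS sum_units,
       product_id,
       CASE WHEN (date BETWEEN '2015-01-01' AND '2015-01-08') THEN 'week1'
            WHEN (date BETWEEN '2015-01-09' AND '2015-01-16') THEN 'week2' END
FROM daily_sales
WHERE date BETWEEN '2015-01-01' AND '2015-01-16'
GROUP BY product_id,
         CASE WHEN (date BETWEEN '2015-01-01' AND '2015-01-08') THEN 'week1'
              WHEN (date BETWEEN '2015-01-09' AND '2015-01-16') THEN 'week2' END;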
Simplify your SQL to the following and try converting it:
SELECT ds.product_id
     , SUM(CASE WHEN date BETWEEN '2015-01-01' AND '2015-01-08' AND service_type = 1
                THEN units
           END) WK1
     , SUM(CASE WHEN date BETWEEN '2015-01-09' AND '2015-01-16' AND service_type = 1
                THEN units
           END) WK2
FROM daily_sales as ds
group by ds.product_id
Every Rails developer sooner or later hits his/her head against the walls of the Active Record query interface, only to find the solution in Arel.
Arel gives you the flexibility you need to create your query without using loops, etc. I am not going to give runnable code, rather some hints on how to do it yourself:
We are going to use Arel tables to create our query. For a model called, for example, Product, getting the Arel table is as easy as products = Product.arel_table
Getting the sum of a column looks like daily_sales.project(daily_sales[:units].sum).where(daily_sales[:date].gt(BEGIN_DATE)).where(daily_sales[:date].lt(END_DATE)). You can chain as many wheres as you want and they will be translated into SQL ANDs.
Since we need multiple sums in the end result, you need to make use of Common Table Expressions (CTEs). Take a look at the docs and this answer for more info on this.
You can then use those CTEs from the previous step in combination with group, and you are done!
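The SQL those hints build toward is shaped something like this (hand-written to match the question's columns; not output captured from Arel):
WITH wk1 AS (
    SELECT product_id, SUM(units) AS units
    FROM daily_sales
    WHERE date BETWEEN '2015-01-01' AND '2015-01-08' AND service_type = 1
    GROUP BY product_id
), wk2 AS (
    SELECT product_id, SUM(units) AS units
    FROM daily_sales
    WHERE date BETWEEN '2015-01-09' AND '2015-01-16' AND service_type = 1
    GROUP BY product_id
)
SELECT ds.product_id, wk1.units AS wk1, wk2.units AS wk2
FROM daily_sales AS ds
LEFT JOIN wk1 ON wk1.product_id = ds.product_id
LEFT JOIN wk2 ON wk2.product_id = ds.product_id
GROUP BY ds.product_id, wk1.units, wk2.units;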