I'm trying to get a report built up from data mining our accounting software.
We have a table that stores the balances of each account in a general ledger for a given period (0-12, where 0 is the carry-over from last year and 1-12 are the corresponding months), the amount, and other data I don't need.
I'm trying, unsuccessfully, to get a value for each account for each month; however, there isn't always a corresponding entry. I've tried left outer joins, cross joins, and inner joins, and can't seem to get it to work how I want. I've even tried left outer joins against a table containing 'Initial' as item 0 and 12 other entries, one name for each month.
Here's a sample of the data:
GLBalances table:
acct_no | post_prd | post_trn_amt
1011 | 0 | -15000
1011 | 1 | 5000
1011 | 2 | -6000
1011 | 4 | 8000
1020 | 5 | 100
1020 | 12 | 300
1011 | 9 | 500
1011 | 8 | 0
etc...
What I'd like to get out is:
acct_no | post_prd | post_trn_amt
1011 | 0 | -15000
1011 | 1 | 5000
1011 | 2 | -6000
1011 | 3 | 0
1011 | 4 | 8000
1011 | 5 | 0
1011 | 6 | 0
1011 | 7 | 0
1011 | 8 | 0
1011 | 9 | 500
1011 | 10 | 0
1011 | 11 | 0
1011 | 12 | 0
1020 | 0 | 0
1020 | 1 | 0
1020 | 2 | 0
1020 | 3 | 0
1020 | 4 | 0
1020 | 5 | 100
1020 | 6 | 0
1020 | 7 | 0
1020 | 8 | 0
1020 | 9 | 0
1020 | 10 | 0
1020 | 11 | 0
1020 | 12 | 300
etc...
So basically, 13 entries for each account for a particular year, even if there's no entry for that period.
I'm sure this is way easier than I'm making it; I'm just struggling since I don't deal with SQL on a daily basis. Any help would be much appreciated.
You can create a sheet of valid accounts and months with cross join. Look for the corresponding "real" row with a left join, and you're set:
;with months as
(
select 0 as Month
union all
select Month + 1 from months where Month < 12
)
select a.acct_no, m.month as post_prd, IsNull(g.post_trn_amt, 0) as post_trn_amt
from months m
cross join (select distinct acct_no from #GLBalances) a
left join #GLBalances g
on m.month = g.post_prd
and a.acct_no = g.acct_no
order by a.acct_no, m.month
The "with months as" construct is a fancy way to create a table containing numbers 0 to 12. You can also create a real table containing those numbers, and do away with the "recursive common table expression" construct.
Here's the test data I used:
create table #GLBalances (acct_no int, post_prd int, post_trn_amt int)
insert into #GLBalances
select 1011,0,-15000
union all select 1011, 1, 5000
union all select 1011, 2, -6000
union all select 1011, 4, 8000
union all select 1020, 5, 100
union all select 1020, 12, 300
union all select 1011, 9, 500
union all select 1011, 8, 0
I am trying to summarize sales data by month, sales region and type. The problem is, the results change when I try to group by year.
My simplified query is as follows:
SELECT
DAB700.DATUM,DAB000.X_REGION,DAB700.BELEG_ART, // the date, sales region, order type
// calculate the number of orders per month
COUNT (DISTINCT CASE WHEN MONTH(DAB700.DATUM) = 1 THEN DAB700.BELEG_NR END) as jan,
COUNT (DISTINCT CASE WHEN MONTH(DAB700.DATUM) = 2 THEN DAB700.BELEG_NR END) as feb,
COUNT (DISTINCT CASE WHEN MONTH(DAB700.DATUM) = 3 THEN DAB700.BELEG_NR END) as mar
FROM "DAB700.ADT" DAB700
left join "DAB050.ADT" DAB050 on DAB700.BELEG_NR = DAB050.ANUMMER // join to table 050, to pull in order info
left join "DF030000.DBF" DAB000 on DAB050.KDNR = DAB000.KDNR // join table 000 to table 050, to pull in customer info
left join "DAB055.ADT" DAB055 on DAB050.ANUMMER = left (DAB055.APNUMMER,6)// join table 055 to table 050, to pull in product info
WHERE (DAB700.BELEG_ART = 10 OR DAB700.BELEG_ART = 20)
  AND (DAB700.DATUM >= {d '2021-01-01'}) AND (DAB700.DATUM <= {d '2021-01-11'})
  AND DAB055.ARTNR <> '999999' AND DAB055.ARTNR <> '999996'
  AND DAB055.TERMIN <> 'KW.22.22' AND DAB055.TERMIN <> 'KW.99.99'
  AND DAB050.AUF_ART = 0
group by DAB700.DATUM,DAB000.X_REGION,DAB700.BELEG_ART
This returns the following data, which is correct (manually checked):
| DATUM | X_REGION | BELEG_ART | jan | feb | mar |
|------------|----------|-----------|-----|-----|-----|
| 04.01.2021 | 1 | 10 | 3 | 0 | 0 |
| 04.01.2021 | 3 | 10 | 2 | 0 | 0 |
| 04.01.2021 | 4 | 10 | 1 | 0 | 0 |
| 04.01.2021 | 4 | 20 | 1 | 0 | 0 |
| 04.01.2021 | 6 | 20 | 2 | 0 | 0 |
| 05.01.2021 | 1 | 10 | 1 | 0 | 0 |
and so on....
The total number of records for Jan is 117 (correct).
Now I want to summarize the data in one row (for example, data grouped by region and type),
so I change my code so that I have:
SELECT
YEAR(DAB700.DATUM),
and
group by YEAR(DAB700.DATUM)
the rest of the code stays the same.
Now my results are:
| EXPR | X_REGION | BELEG_ART | jan | feb | mar |
|------|----------|-----------|-----|-----|-----|
| 2021 | 1 | 10 | 16 | 0 | 0 |
| 2021 | 1 | 20 | 16 | 0 | 0 |
| 2021 | 2 | 10 | 19 | 0 | 0 |
| 2021 | 2 | 20 | 22 | 0 | 0 |
| 2021 | 3 | 10 | 12 | 0 | 0 |
| 2021 | 3 | 20 | 6 | 0 | 0 |
Visually it is correct. But the total count for January is now 116, a difference of 1. What am I doing wrong?
How can I keep the results from the first query, but have them presented as per the second set?
You count distinct BELEG_NR. This is what makes the difference. Let's look at an example. Let's say your table contains four rows:
| DATUM      | X_REGION | BELEG_ART | BELEG_NR |
|------------|----------|-----------|----------|
| 04.01.2021 | 1        | 10        | 100      |
| 04.01.2021 | 1        | 10        | 200      |
| 05.01.2021 | 1        | 10        | 100      |
| 05.01.2021 | 1        | 10        | 300      |
That gives you per day, region and belegart:
| DATUM      | X_REGION | BELEG_ART | DISTINCT COUNT BELEG_NR |
|------------|----------|-----------|-------------------------|
| 04.01.2021 | 1        | 10        | 2                       |
| 05.01.2021 | 1        | 10        | 2                       |
and per year, region and belegart
| YEAR | X_REGION | BELEG_ART | DISTINCT COUNT BELEG_NR |
|------|----------|-----------|-------------------------|
| 2021 | 1        | 10        | 3                       |
The BELEG_NR 100 never appears more than once per day, so every instance gets counted. But it appears twice for the year, so it gets counted once instead of twice.
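If you want to see the effect in isolation, here is a minimal, self-contained sketch (generic SQL; the CTE name "sample" is made up, and the rows and column names mirror the tables above):
-- per day: BELEG_NR 100 is distinct within each day, so the two days give 2 + 2 = 4
WITH sample (DATUM, X_REGION, BELEG_ART, BELEG_NR) AS (
    SELECT '2021-01-04', 1, 10, 100 UNION ALL
    SELECT '2021-01-04', 1, 10, 200 UNION ALL
    SELECT '2021-01-05', 1, 10, 100 UNION ALL
    SELECT '2021-01-05', 1, 10, 300
)
SELECT DATUM, X_REGION, BELEG_ART, COUNT(DISTINCT BELEG_NR) AS cnt
FROM sample
GROUP BY DATUM, X_REGION, BELEG_ART;

-- per year: BELEG_NR 100 appears on both days but is counted once, so the total is 3
WITH sample (DATUM, X_REGION, BELEG_ART, BELEG_NR) AS (
    SELECT '2021-01-04', 1, 10, 100 UNION ALL
    SELECT '2021-01-04', 1, 10, 200 UNION ALL
    SELECT '2021-01-05', 1, 10, 100 UNION ALL
    SELECT '2021-01-05', 1, 10, 300
)
SELECT LEFT(DATUM, 4) AS yr, X_REGION, BELEG_ART, COUNT(DISTINCT BELEG_NR) AS cnt
FROM sample
GROUP BY LEFT(DATUM, 4), X_REGION, BELEG_ART;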
I have a table of items in the shop. An item may have multiple entries with the same serial number (sn) but different ids, if the same item was bought again later at a different price (price here is how much a single item cost the shop):
id | sn | amount | price
----+------+--------+-------
1 | AP01 | 100 | 7
2 | AP01 | 50 | 8
3 | X2P0 | 200 | 12
4 | X2P0 | 30 | 18
5 | STT0 | 20 | 20
6 | PLX1 | 200 | 10
and a table of transactions
id | item_id | price
----+---------+-------
1 | 1 | 10
2 | 1 | 9
3 | 1 | 10
4 | 2 | 11
5 | 3 | 15
6 | 3 | 15
7 | 3 | 15
8 | 4 | 18
9 | 5 | 22
10 | 5 | 22
11 | 5 | 22
12 | 5 | 22
and transaction.item_id references items(id)
I want to group items by serial number (sn), get their sum(amount) and avg(price), and join that with a sold column that counts the number of transactions referencing each item's id.
I did the first part with:
select i.sn, sum(i.amount), avg(i.price) from items i group by i.sn;
sn | sum | avg
------+-----+---------------------
STT0 | 20 | 20.0000000000000000
PLX1 | 200 | 10.0000000000000000
AP01 | 150 | 7.5000000000000000
X2P0 | 230 | 15.0000000000000000
Then, when I tried to join it with transactions, I got strange results:
select i.sn, sum(i.amount), avg(i.price) avg_cost, count(t.item_id) sold, sum(t.price) profit
from items i
left join transactions t on (i.id = t.item_id)
group by i.sn;
sn | sum | avg_cost | sold | profit
------+-----+---------------------+------+--------
STT0 | 80 | 20.0000000000000000 | 4 | 88
PLX1 | 200 | 10.0000000000000000 | 0 | (null)
AP01 | 350 | 7.2500000000000000 | 4 | 40
X2P0 | 630 | 13.5000000000000000 | 4 | 63
As you can see, only the sold and profit columns show correct results; the sum and avg show different results than expected.
I can't separate the statements, because I'm not sure how I can add the count to the sn group, which has the item_id as its id:
select
j.sn,
j.sum,
j.avg,
count(item_id)
from (
select
i.sn,
sum(i.amount),
avg(i.price)
from items i
group by i.sn
) j
left join transactions t
on (j.id???=t.item_id);
There are multiple matches in both tables, so the join multiplies the rows (and eventually produces wrong results). I would recommend computing the per-item transaction count first, then aggregating:
select
sn,
sum(amount) total_amount,
avg(price) avg_price,
sum(no_transactions) no_transactions
from (
select
i.*,
(
select count(*)
from transactions t
where t.item_id = i.id
) no_transactions
from items i
) t
group by sn
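For comparison, here is a sketch that aggregates the transactions per item first and then joins, so the join can no longer multiply the item rows; it uses the table and column names from the question and also brings back the profit column from your original attempt:
select
  i.sn,
  sum(i.amount)            as total_amount,
  avg(i.price)             as avg_cost,
  coalesce(sum(t.sold), 0) as sold,
  sum(t.profit)            as profit
from items i
left join (
  -- one row per item_id, so joining it back to items cannot duplicate item rows
  select item_id, count(*) as sold, sum(price) as profit
  from transactions
  group by item_id
) t on t.item_id = i.id
group by i.sn;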
SQL newb here, I'm having a bit of trouble understanding this problem. How can I write a single SELECT statement where I have columns with their own WHERE clauses, do a calculation, and group the results?
I can write a query to sum totals and average checks grouped by revenue center and fiscal year, but I can't quite grasp how to do a side-by-side comparison with a single query.
SALES DATA
| RevenueCenter | FiscalYear | TotalSales | NumChecks |
|---------------|------------|------------|-----------|
| market | 2019 | 2000.00 | 10 |
| restaurant | 2019 | 5000.00 | 25 |
| restaurant | 2020 | 4000.00 | 20 |
| market | 2020 | 3000.00 | 10 |
COMPARE REPORT
| RevenueCenter | TotalSales2020 | TotalSales2019 | %Change | AvgCheck2020 | AvgCheck2019 | %Change |
|---------------|----------------|----------------|---------|--------------|--------------|---------|
| market        | 3000.00        | 2000.00        | +50%    | 300.00       | 200.00       | +50%    |
| restaurant    | 4000.00        | 5000.00        | -20%    | 200.00       | 200.00       | 0%      |
Would this help? No big deal, just a self-join with some arithmetic.
with sales (revenuecenter, fiscalyear, totalsales, numchecks) as
  -- sample data
  (select 'market'    , 2019, 2000, 10 from dual union all
   select 'market'    , 2020, 3000, 10 from dual union all
   select 'restaurant', 2019, 5000, 25 from dual union all
   select 'restaurant', 2020, 4000, 20 from dual
  )
-- query you need
select a.revenuecenter,
       b.totalsales totalsales2020,
       a.totalsales totalsales2019,
       (b.totalsales / a.totalsales) * 100 - 100 "%change totalsal",
       b.totalsales / b.numchecks avgcheck2020,
       a.totalsales / a.numchecks avgcheck2019,
       (b.totalsales / b.numchecks) /
       (a.totalsales / a.numchecks) * 100 - 100 "%change numcheck"
from sales a join sales b on a.revenuecenter = b.revenuecenter
                         and a.fiscalyear < b.fiscalyear;

REVENUECEN TOTALSALES2020 TOTALSALES2019 %change totalsal AVGCHECK2020 AVGCHECK2019 %change numcheck
---------- -------------- -------------- ---------------- ------------ ------------ ----------------
market               3000           2000               50          300          200               50
restaurant           4000           5000              -20          200          200                0
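If you'd rather avoid the self-join, here is a sketch of the same report with conditional aggregation (a single pass over the table; the hard-coded 2019/2020 literals are an assumption for this two-year compare, and the output column names are illustrative):
select revenuecenter,
       -- each CASE acts as a per-column WHERE clause inside the aggregate
       sum(case when fiscalyear = 2020 then totalsales end) as totalsales2020,
       sum(case when fiscalyear = 2019 then totalsales end) as totalsales2019,
       sum(case when fiscalyear = 2020 then totalsales end)
         / sum(case when fiscalyear = 2019 then totalsales end) * 100 - 100 as pct_change_sales,
       sum(case when fiscalyear = 2020 then totalsales end)
         / sum(case when fiscalyear = 2020 then numchecks end) as avgcheck2020,
       sum(case when fiscalyear = 2019 then totalsales end)
         / sum(case when fiscalyear = 2019 then numchecks end) as avgcheck2019,
       (sum(case when fiscalyear = 2020 then totalsales end)
          / sum(case when fiscalyear = 2020 then numchecks end))
         / (sum(case when fiscalyear = 2019 then totalsales end)
          / sum(case when fiscalyear = 2019 then numchecks end)) * 100 - 100 as pct_change_avgcheck
from sales
group by revenuecenter;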
Hello there, I have this example dataset:
employee_id | amount | cumulative_amount
-------------+------------+-----------------
2 | 100 | 100
6 | 220 | 320
7 | 45 | 365
8 | 50 | 415
9 | 110 | 525
16 | 300 | 825
17 | 250 | 1075
18 | 200 | 1275
And an interval, let's say 300.
I'd like to pick only rows that match the interval, with the condition:
Pick a row if its cumulative_amount is >= the previously picked value + interval
(e.g. if the start value = 100, the next matching row is the one where cumulative_amount >= 400, and so on):
employee_id | amount | cumulative_amount
-------------+------------+-----------------
2 | 100 | 100 <-- $Start
6 | 220 | 320 - 400
7 | 45 | 365 - 400
8 | 50 | 415 <-- 1
9 | 110 | 525 - 715 (prev value (415)+300)
16 | 300 | 825 <-- 2
17 | 250 | 1075 - 1125 (825+300)
18 | 200 | 1275 <-- 3
So the final result would be:
employee_id | amount | cumulative_amount
-------------+------------+-----------------
2 | 100 | 100
8 | 50 | 415
16 | 300 | 825
18 | 200 | 1275
How can I achieve this in PostgreSQL in the most efficient way?
The cumulative_amount column is a running sum of the amount column; it's calculated in another query, whose result is the dataset above. The table is ordered by employee_id.
Regards.
Not saying it is the most efficient way, but probably the easiest:
s=# create table s1(a int, b int, c int);
CREATE TABLE
Time: 10.262 ms
s=# copy s1 from stdin delimiter '|';
...
s=# with g as (select generate_series(100,1300,300) s)
, o as (select *,sum(b) over (order by a) from s1)
, c as (select *, min(sum) over (partition by g.s)
from o
join g on sum >= g.s and sum < g.s + 300
)
select a,b,sum from c
where sum = min
;
a | b | sum
----+-----+------
2 | 100 | 100
8 | 50 | 415
16 | 300 | 825
17 | 250 | 1075
(4 rows)
Here I used order by a since you said your cumulative sum is ordered by the first column (which I reconciled against the third column).
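If you need the literal "previously picked value + interval" rule (which, per the question's expected output, should end with 1275 rather than 1075), here is a sketch with a recursive CTE; it assumes the cumulative result set is available as a table or CTE named data(employee_id, amount, cumulative_amount), and the 300 is the interval from the question:
with recursive picked as (
    (select employee_id, amount, cumulative_amount   -- start: the first row
     from data
     order by cumulative_amount
     limit 1)
  union all
    select nxt.employee_id, nxt.amount, nxt.cumulative_amount
    from picked p
    cross join lateral (
        -- first row at or above the previously picked value + 300
        select employee_id, amount, cumulative_amount
        from data
        where cumulative_amount >= p.cumulative_amount + 300
        order by cumulative_amount
        limit 1
    ) nxt
)
select * from picked;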
I am currently making a monthly report using MySQL. I have a table named "monthly" that looks something like this:
id | date | amount
10 | 2009-12-01 22:10:08 | 7
9 | 2009-11-01 22:10:08 | 78
8 | 2009-10-01 23:10:08 | 5
7 | 2009-07-01 21:10:08 | 54
6 | 2009-03-01 04:10:08 | 3
5 | 2009-02-01 09:10:08 | 456
4 | 2009-02-01 14:10:08 | 4
3 | 2009-01-01 20:10:08 | 20
2 | 2009-01-01 13:10:15 | 10
1 | 2008-12-01 10:10:10 | 5
Then, when I make a monthly report (which is grouped by month of each year), I get something like this:
yearmonth | total
2008-12 | 5
2009-01 | 30
2009-02 | 460
2009-03 | 3
2009-07 | 54
2009-10 | 5
2009-11 | 78
2009-12 | 7
I used this query to achieve the result:
SELECT substring( date, 1, 7 ) AS yearmonth, sum( amount ) AS total
FROM monthly
GROUP BY substring( date, 1, 7 )
But I need something like this:
yearmonth | total
2008-01 | 0
2008-02 | 0
2008-03 | 0
2008-04 | 0
2008-05 | 0
2008-06 | 0
2008-07 | 0
2008-08 | 0
2008-09 | 0
2008-10 | 0
2008-11 | 0
2008-12 | 5
2009-01 | 30
2009-02 | 460
2009-03 | 3
2009-04 | 0
2009-05 | 0
2009-06 | 0
2009-07 | 54
2009-08 | 0
2009-09 | 0
2009-10 | 5
2009-11 | 78
2009-12 | 7
Something that would display zeroes for the months that don't have any value. Is it even possible to do that in a MySQL query?
You should generate a dummy rowsource and LEFT JOIN with it:
SELECT CONCAT(year, '-', LPAD(month, 2, '0')) AS yearmonth,
COALESCE(SUM(m.amount), 0) AS total
FROM (
SELECT 1 AS month
UNION ALL
SELECT 2
…
UNION ALL
SELECT 12
) months
CROSS JOIN
(
SELECT 2008 AS year
UNION ALL
SELECT 2009 AS year
) years
LEFT JOIN
mydata m
ON m.date >= CONCAT_WS('.', year, month, 1)
AND m.date < CONCAT_WS('.', year, month, 1) + INTERVAL 1 MONTH
GROUP BY
year, month
ORDER BY
year, month
You can create these as tables on disk rather than generate them each time.
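For instance, a small sketch of that on-disk variant (the table and column names are illustrative, matching the aliases used above):
CREATE TABLE months (month TINYINT NOT NULL PRIMARY KEY);
INSERT INTO months (month)
VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12);

CREATE TABLE years (year SMALLINT NOT NULL PRIMARY KEY);
INSERT INTO years (year)
VALUES (2008),(2009);

-- the query above then becomes: FROM months CROSS JOIN years LEFT JOIN mydata m ON ...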
MySQL is the only system of the major four that does not have an easy way to generate arbitrary resultsets.
Oracle, SQL Server and PostgreSQL do have one (CONNECT BY, recursive CTEs and generate_series, respectively).
Quassnoi is right, and I'll add a comment about how to recognize when you need something like this:
You want '2008-01' in your result, yet nothing in the source table has a date in January, 2008. Result sets have to come from the tables you query, so the obvious conclusion is that you need an additional table - one that contains each month you want as part of your result.