Querying and adding rows - sql

OK, this is my second attempt at resolving this issue. For those reading it a second time, I hope it is now clear enough to understand the problem.
I am developing a query for a report. While retrieving data from the database, the report needs to include some rows that do not exist. For illustration, let's say I have these tables:
Table 1 - Companies
Table 2 - Transactions.
Table 3 - Transaction types.
An important detail is that most companies do not have transactions of every transaction type. The report logic, however, requires displaying a company with all of them: the "real" ones with real money values, and the non-existent ones with just $0. The problem starts here because transaction types are combined into logical groups. Say a company has only one real transaction, of type_1: the report should then contain "$0" records for the other types associated with type_1, such as type_2, type_3 and type_4. If a company has transactions of type_1 and type_2, the report should be populated with some other transaction types from a different transaction type group, and so on.
The difficulty is that this must be done in pure SQL. As a Java programmer I understand how easy it is to query the database, load the data into an array[][] and add the missing transaction types, but the query will be run on UNIX inside a PL/SQL batch, so it has to be a single (or joined) SELECT.
Thanks in advance. Any help or ideas would be much appreciated!

It sounds like you just need some sort of outer join. I'm guessing at how your tables relate to each other but it appears that you want something like
SELECT c_typ_cross_join.company_name,
c_typ_cross_join.transaction_type,
nvl( sum( t.transaction_amount ), 0 ) total_amt
FROM (SELECT c.company_id,
c.company_name,
typ.transaction_type
FROM companies c
CROSS JOIN transaction_type typ) c_typ_cross_join
LEFT OUTER JOIN transactions t ON ( c_typ_cross_join.company_id = t.company_id
AND c_typ_cross_join.transaction_type = t.transaction_type)
GROUP BY c_typ_cross_join.company_name,
c_typ_cross_join.transaction_type
This should produce one row for every company for every transaction type and the sum of the related transactions (or 0 if there are no transactions for the combination of companies and transaction types).
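For anyone who wants to try the idea, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Oracle. The table and column names are the same guesses as above, the sample rows are made up, and COALESCE is used in place of NVL so the query runs on SQLite.

```python
import sqlite3

# In-memory database with made-up sample data (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (company_id INTEGER PRIMARY KEY, company_name TEXT);
    CREATE TABLE transaction_type (transaction_type TEXT PRIMARY KEY);
    CREATE TABLE transactions (
        company_id INTEGER, transaction_type TEXT, transaction_amount INTEGER);
    INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO transaction_type VALUES ('type_1'), ('type_2');
    INSERT INTO transactions VALUES
        (1, 'type_1', 100), (1, 'type_1', 50), (2, 'type_2', 30);
""")

# Every company paired with every type, then real amounts attached where they exist.
rows = conn.execute("""
    SELECT c.company_name,
           typ.transaction_type,
           COALESCE(SUM(t.transaction_amount), 0) AS total_amt
    FROM companies c
    CROSS JOIN transaction_type typ
    LEFT JOIN transactions t
           ON t.company_id = c.company_id
          AND t.transaction_type = typ.transaction_type
    GROUP BY c.company_name, typ.transaction_type
    ORDER BY c.company_name, typ.transaction_type
""").fetchall()

for row in rows:
    print(row)
```

The combinations with no transactions come back with a total of 0 rather than being dropped, which is the behaviour the report needs.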

You could use two sub-queries: one to find all transactions per company based on the existing types the company has, and a second to find the totals.
SELECT companies.companyid, all_transactions.transaction, COALESCE(sums.total_amount, 0)
FROM companies
JOIN (SELECT ct.companyid, t.transaction
FROM transactions ct
JOIN transactions t ON t.transactiontype = ct.transactiontype
GROUP BY ct.companyid, t.transaction) all_transactions ON all_transactions.companyid = companies.companyid
LEFT JOIN (SELECT ct.companyid, SUM(ct.amount) as total_amount
FROM transactions ct
GROUP BY ct.companyid) sums ON sums.companyid = companies.companyid
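If the zero rows should only be filled in for types belonging to a group the company already has transactions in, one possible variation is sketched below. It assumes a hypothetical type_group column on the transaction type table (the question does not say how groups are stored), uses made-up data, and runs on SQLite through Python's sqlite3 module rather than on Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (company_id INTEGER PRIMARY KEY, company_name TEXT);
    -- type_group is a hypothetical column describing the logical grouping
    CREATE TABLE transaction_type (transaction_type TEXT PRIMARY KEY, type_group TEXT);
    CREATE TABLE transactions (
        company_id INTEGER, transaction_type TEXT, transaction_amount INTEGER);
    INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO transaction_type VALUES
        ('type_1', 'A'), ('type_2', 'A'), ('type_3', 'B');
    INSERT INTO transactions VALUES (1, 'type_1', 100), (2, 'type_3', 30);
""")

rows = conn.execute("""
    WITH company_groups AS (
        -- groups in which each company has at least one real transaction
        SELECT DISTINCT t.company_id, typ.type_group
        FROM transactions t
        JOIN transaction_type typ ON typ.transaction_type = t.transaction_type
    )
    SELECT c.company_name,
           typ.transaction_type,
           COALESCE(SUM(t.transaction_amount), 0) AS total_amt
    FROM companies c
    JOIN company_groups cg ON cg.company_id = c.company_id
    JOIN transaction_type typ ON typ.type_group = cg.type_group
    LEFT JOIN transactions t
           ON t.company_id = c.company_id
          AND t.transaction_type = typ.transaction_type
    GROUP BY c.company_name, typ.transaction_type
    ORDER BY c.company_name, typ.transaction_type
""").fetchall()

for row in rows:
    print(row)
```

Here Acme gets a $0 row for type_2 (same group as its real type_1 transaction) but nothing for type_3, which belongs to a group it has never used.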

Related

SQLite selecting transactions that do / do not meet a particular criteria

I am trying to extract data from a GnuCash SQLite database. Relevant tables include accounts, transactions, and splits. Simplistically, accounts contain transactions which contain splits, and each split points back to an account.
The transactions need to be processed differently depending on whether each one does or does not include a particular kind of transaction fee—in this case whether or not the transaction contains a split linked to account 8190-000.
I've set up two queries, one that handles transactions with the transaction fee, and one that handles transactions without the transaction fee. The queries work, but they are awkward and wordy, and I'm sure there is a better way to do this. I did see not exists in this answer, but could not figure out how to make it work in this situation.
My current queries look like this:
-- Find all transactions containing a split with account code 8190-000
select tx_guid from transactions
inner join
(select tx_guid from
(splits inner join accounts on splits.account_guid = accounts.guid)
where accounts.code = "8190-000") fee_transactions
on fee_transactions.tx_guid = transactions.guid;
-- Find all transactions not containing a split with account code 8190-000
select guid from transactions
except
select tx_guid from transactions
inner join
(select tx_guid from
(splits inner join accounts on splits.account_guid = accounts.guid)
where accounts.code = "8190-000") fee_transactions
on fee_transactions.tx_guid = transactions.guid;
Given that I need to use these results in other queries, what is a simpler and more succinct way to obtain these lists of transactions?
You can use EXISTS for your 1st query like this:
SELECT t.*
FROM transactions t
WHERE EXISTS (
SELECT 1
FROM splits s INNER JOIN accounts a
ON s.account_guid = a.guid
WHERE a.code = '8190-000' AND ?.tx_guid = t.guid
);
Change ? to s or a, depending on which table contains the column tx_guid (splits or accounts), since it is not clear in your question.
Also, change to NOT EXISTS for your 2nd query.
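Here is a small runnable sketch of both variants using Python's sqlite3 module. The three tables are cut down to the columns used in the question, the sample rows are invented, and it assumes tx_guid lives on splits, as the question's own queries suggest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (guid TEXT PRIMARY KEY, code TEXT);
    CREATE TABLE transactions (guid TEXT PRIMARY KEY);
    CREATE TABLE splits (guid TEXT PRIMARY KEY, tx_guid TEXT, account_guid TEXT);
    INSERT INTO accounts VALUES ('a1', '8190-000'), ('a2', '1000-000');
    INSERT INTO transactions VALUES ('t1'), ('t2');
    -- t1 has a split on the fee account, t2 does not
    INSERT INTO splits VALUES
        ('s1', 't1', 'a1'), ('s2', 't1', 'a2'), ('s3', 't2', 'a2');
""")

# The same query serves both cases; only the NOT keyword differs.
FEE_FILTER = """
    SELECT t.guid
    FROM transactions t
    WHERE {negate} EXISTS (
        SELECT 1
        FROM splits s
        INNER JOIN accounts a ON s.account_guid = a.guid
        WHERE a.code = '8190-000' AND s.tx_guid = t.guid
    )
    ORDER BY t.guid
"""

with_fee = [r[0] for r in conn.execute(FEE_FILTER.format(negate=""))]
without_fee = [r[0] for r in conn.execute(FEE_FILTER.format(negate="NOT"))]

print("with fee:", with_fee)
print("without fee:", without_fee)
```

Either form can be dropped into a larger query as a subquery or a CTE, which should help if the lists are needed elsewhere.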

Trouble writing recursive query

I apologize for how potentially cluttered the explanation to my problem might be. I've
included details so that things make as much sense as possible leading up to the main
obstacle I've come across.
I'm working within Teradata using two tables that look like the following
Table Name    Fields
Sales         (ID, Sales)
Discounts     (ID, PromoNum, Discount)
The PromoNum field consists of 9 digit unique promotion numbers which correspond to coupons.
This helps track whenever a transaction includes a specific coupon that was used. Each
transaction can have more than 1 coupon applied.
I'm trying to create a recursive query which pulls sales and discounts for a given set of coupons
in an iterative manner. The reason I'm doing so iteratively is because it is possible that a
single transaction can have more than 1 coupon applied (for 1 or more items). If I were to avoid the
recursive query route and do an inner join on ID for example, it is possible that I could duplicate
records unnecessarily where two or more promo numbers were used within the same transaction, resulting
in potentially greater sales or discounts than actual. On top of this, I only have read access
to the database.
I've created a temp table called Promos containing the 3 specific promotions that I want to run iteratively;
it has the fields PromoNum and PromoIndex. PromoIndex is essentially the row number for each
promotion, which I attempt to use in an iterative manner below.
The recursive query I've written so far is below. It doesn't work as expected due to the logic
behind the line I've commented. I need to rewrite this portion to make sure it simply runs for
the promotion number corresponding to the index at that specific iteration. For instance, when it
is at iteration 2, it will technically join on PromoIndex 1 and PromoIndex 2 when it should only run
for PromoIndex 2 if that makes sense. I've attempted to rewrite it while remaining within what's
allowed in a recursive query and I can't figure it out.
WITH RECURSIVE PromoData AS
(
SELECT
1 AS PromoIndex
, 1 AS PromoNum --dummy column
, 0 AS Sales --dummy column
, 0 AS Discounts --dummy column
FROM
Dummy Table
UNION ALL
SELECT
PromoData.PromoIndex + 1
, PromoData.PromoNum
, Sales.Sales
, Discounts.Discounts --Edited here
FROM Sales
INNER JOIN Discounts on Sales.ID = Discounts.ID
INNER JOIN Promos on Promos.PromoNum = Discounts.PromoNum and Promos.PromoIndex = PromoData.PromoIndex --Problematic portion here
WHERE PromoData.PromoIndex <= 3
)
SELECT *
FROM PromoData
From what you describe, you want:
select s.*
from sales s
where exists (select 1
from discounts d join
promos p
on d.promonum = p.promonum
where d.id = s.id
);
I don't see what a recursive query has to do with the problem you have described.
Recursive queries are normally used to resolve multiple layers of hierarchical rows, like those with a parent / child relationship. I don't think that is needed in this case.
The main issue I see here is you're trying to relate sales and discounts, but I don't see a natural way to do that. For example, if a transaction has $100 of sales and two discounts of $10 and $20 how much of the $100 gets attributed to each discount? I think this is what you meant by "two or more promo numbers being used within the same transaction" causing inflated figures.
Assuming your ID field is used as a transaction_ID, you can try something like:
WITH coupons AS (
SELECT 'PromoID1' AS PromoNum UNION ALL
SELECT 'PromoID2' AS PromoNum UNION ALL
SELECT 'PromoID3' AS PromoNum
)
SELECT
c.PromoNum,
COALESCE(info.sales, 0) sales,
COALESCE(info.discounts, 0) discounts
FROM coupons c -- get all specified coupons
LEFT JOIN (
SELECT
MAX(s.sales) sales,
SUM(d.discount) discounts, -- Get total discount for txn
MAX(d.PromoNum) AS PromoNum -- Pick a single PromoNum
FROM sales s -- Get all sales
LEFT JOIN discounts d ON s.ID = d.ID -- Get any discounts applied to sales
GROUP BY s.ID -- One row per txn (avoid double counting sales)
) info ON c.PromoNum = info.PromoNum -- Get related sales / discounts per PromoNum
The difference here is that in the case of a transaction with multiple discounts, all of the sales for that transaction will only be associated with a single PromoNum. This way you won't get inflated sales numbers.
Not sure if that's what you're after, but hope that helps.
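To make the double-counting problem concrete, here is a minimal sketch on SQLite via Python's sqlite3 module (not Teradata). The Sales and Discounts tables follow the question's layout; the rows and promo numbers are invented. It compares a plain join against a join on discounts that were first collapsed to one row per transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (ID INTEGER PRIMARY KEY, Sales INTEGER);
    CREATE TABLE Discounts (ID INTEGER, PromoNum INTEGER, Discount INTEGER);
    INSERT INTO Sales VALUES (1, 100), (2, 50);
    -- transaction 1 used two coupons, transaction 2 used one
    INSERT INTO Discounts VALUES
        (1, 111111111, 10), (1, 222222222, 20), (2, 111111111, 5);
""")

# Plain join: transaction 1 appears once per coupon, so its sales count twice.
naive_sales = conn.execute("""
    SELECT SUM(s.Sales)
    FROM Sales s
    INNER JOIN Discounts d ON d.ID = s.ID
""").fetchone()[0]

# Collapse discounts to one row per transaction before joining.
sales, discounts = conn.execute("""
    SELECT SUM(s.Sales), SUM(d.total_discount)
    FROM Sales s
    INNER JOIN (
        SELECT ID, SUM(Discount) AS total_discount
        FROM Discounts
        GROUP BY ID
    ) d ON d.ID = s.ID
""").fetchone()

print("naive sales:", naive_sales)
print("sales:", sales, "discounts:", discounts)
```

The first figure is inflated because the $100 sale is repeated for each of its coupons; aggregating first keeps one row per transaction, so no recursion is needed.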

PL SQL CASE WHEN statement

I am looking to find a way to write a CASE WHEN statement that returns a value if ANY of the items = a specific criteria. Here is a little background:
I have transactions that have several different assets on each transaction and those assets could all have different suppliers associated with them. I am aggregating the data to a transaction level and I am looking to flag the transactions where ANY asset on the transaction has a supplier that is the same as the customer on that transaction.
So if any supplier from all the assets on transaction X equals the customer on transaction X, flag this deal.
You can use exists. Without sample data and desired results, it is a little tricky to form the entire query, but something like this:
select t.*,
(case when exists (select 1 from suppliers s where t.customer = s.supplier)
then 1 else 0
end) as flag
from transactions t;
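Since the real schema is not shown, the sketch below assumes a hypothetical assets table with transaction_id and supplier columns, and correlates the subquery on the transaction as well as on the customer. It runs on SQLite through Python's sqlite3 module rather than Oracle, with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one row per asset, each with its own supplier
    CREATE TABLE transactions (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE assets (transaction_id INTEGER, supplier TEXT);
    INSERT INTO transactions VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO assets VALUES (1, 'Acme'), (1, 'Initech'), (2, 'Initech');
""")

rows = conn.execute("""
    SELECT t.id,
           CASE WHEN EXISTS (
                    SELECT 1
                    FROM assets a
                    WHERE a.transaction_id = t.id
                      AND a.supplier = t.customer
                )
                THEN 1 ELSE 0
           END AS flag
    FROM transactions t
    ORDER BY t.id
""").fetchall()

print(rows)
```

Transaction 1 is flagged because one of its two assets is supplied by its own customer; transaction 2 is not.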

SQL Counting and Joining

I'm taking a database course this semester, and we're learning SQL. I understand most simple queries, but I'm having some difficulty using the count aggregate function.
I'm supposed to relate an advertisement number to a property number to a branch number so that I can tally up the number of advertisements by branch number and compute their cost. I set up what I think are two appropriate new views, but I'm clueless as to what to write for the select statement. Am I approaching this the correct way? I have a feeling I'm overcomplicating this big time...
with ad_prop(ad_no, property_no, overseen_by) as
(select a.ad_no, a.property_no, p.overseen_by
from advertisement as a, property as p
where a.property_no = p.property_no)
with prop_branch(property_no, overseen_by, allocated_to) as
(select p.property_no, p.overseen_by, s.allocated_to
from property as p, staff as s
where p.overseen_by = s.staff_no)
select distinct pb.allocated_to as branch_no, count( ??? ) * 100 as ad_cost
from prop_branch as pb, ad_prop as ap
where ap.property_no = pb.property_no
group by branch_no;
Any insight would be greatly appreciated!
You could simplify it like this:
advertisement
- ad_no
- property_no
property
- property_no
- overseen_by
staff
- staff_no
- allocated_to
SELECT s.allocated_to AS branch, COUNT(*) as num_ads, COUNT(*)*100 as ad_cost
FROM advertisement AS a
INNER JOIN property AS p ON a.property_no = p.property_no
INNER JOIN staff AS s ON p.overseen_by = s.staff_no
GROUP BY s.allocated_to;
Update: changed above to match your schema needs
You can condense your WITH clauses into a single statement. Then, the piece I think you are missing is that columns referenced in the column definition have to be aggregated if they aren't included in the GROUP BY clause. So you GROUP BY your distinct column then apply your aggregation and math in your column definitions.
SELECT
s.allocated_to AS branch_no
,COUNT(a.ad_no) AS ad_count
,(COUNT(a.ad_no) * 100) AS ad_cost
...
GROUP BY s.allocated_to
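A quick way to see the grouping rule in action is the sketch below, which uses Python's sqlite3 module with the three tables from the question reduced to the columns involved. The sample rows are made up, and the $100 per ad comes from the question's own multiplier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE advertisement (ad_no INTEGER PRIMARY KEY, property_no INTEGER);
    CREATE TABLE property (property_no INTEGER PRIMARY KEY, overseen_by INTEGER);
    CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, allocated_to TEXT);
    INSERT INTO staff VALUES (100, 'B1'), (200, 'B2');
    INSERT INTO property VALUES (10, 100), (20, 200);
    -- two ads for property 10, one ad for property 20
    INSERT INTO advertisement VALUES (1, 10), (2, 10), (3, 20);
""")

rows = conn.execute("""
    SELECT s.allocated_to        AS branch_no,
           COUNT(a.ad_no)        AS ad_count,
           COUNT(a.ad_no) * 100  AS ad_cost
    FROM advertisement AS a
    INNER JOIN property AS p ON a.property_no = p.property_no
    INNER JOIN staff AS s ON p.overseen_by = s.staff_no
    GROUP BY s.allocated_to
    ORDER BY s.allocated_to
""").fetchall()

for row in rows:
    print(row)
```

The only non-aggregated column in the SELECT list is the one named in GROUP BY, so DISTINCT is not needed.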
I can tell you that you are making it way too complicated. It should be a select statement with a couple of joins. You should re-read the chapter on joins or take a look at the following link
http://www.sql-tutorial.net/SQL-JOIN.asp
A join allows you to "combine" the data from two tables based on a common key between the two tables (you can chain more tables together with more joins). Once you have this "joined" table, you can pretend that it is really one table (aliases are used to indicate where each column came from). You understand how aggregates work on a single table, right?
I'd prefer not to give you the answer so that you can actually learn :)

Uses of unequal joins

Of all the thousands of queries I've written, I can probably count on one hand the number of times I've used a non-equijoin. e.g.:
SELECT * FROM tbl1 INNER JOIN tbl2 ON tbl1.date > tbl2.date
And most of those instances were probably better solved using another method. Are there any good/clever real-world uses for non-equijoins that you've come across?
Bitmasks come to mind. In one of my jobs, we had permissions for a particular user or group on an "object" (usually corresponding to a form or class in the code) stored in the database. Rather than including a row or column for each particular permission (read, write, read others, write others, etc.), we would typically assign a bit value to each one. From there, we could then join using bitwise operators to get objects with a particular permission.
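A rough sketch of what such a bitwise join might look like, using Python's sqlite3 module. The users and permissions tables, their columns, and the bit assignments are all hypothetical; the original schema is not described beyond the idea of one bit per permission.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: each permission owns one bit, users store the OR of theirs
    CREATE TABLE permissions (name TEXT PRIMARY KEY, mask INTEGER);
    CREATE TABLE users (name TEXT PRIMARY KEY, perms INTEGER);
    INSERT INTO permissions VALUES ('read', 1), ('write', 2), ('delete', 4);
    INSERT INTO users VALUES ('alice', 3), ('bob', 1);
""")

# Non-equijoin: a permission matches when its bit is set in the user's value.
rows = conn.execute("""
    SELECT u.name, p.name
    FROM users u
    INNER JOIN permissions p ON (u.perms & p.mask) <> 0
    ORDER BY u.name, p.mask
""").fetchall()

for row in rows:
    print(row)
```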
How about for checking for overlaps?
select ...
from employee_assignments ea1
, employee_assignments ea2
where ea1.emp_id = ea2.emp_id
and ea1.end_date >= ea2.start_date
and ea1.start_date <= ea2.end_date
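An overlap check of this kind can be tried with the sketch below (Python's sqlite3 module, invented rows). Two ranges overlap when each one starts on or before the other one ends. The assignment_id column is an assumption added here so that a row is not compared with itself and each pair is reported once; dates are stored as ISO strings, which compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_assignments (
        assignment_id INTEGER PRIMARY KEY,  -- assumed key, not in the original
        emp_id INTEGER,
        start_date TEXT,
        end_date TEXT);
    INSERT INTO employee_assignments VALUES
        (1, 1, '2024-01-01', '2024-03-31'),
        (2, 1, '2024-03-15', '2024-06-30'),
        (3, 1, '2024-07-01', '2024-12-31');
""")

overlaps = conn.execute("""
    SELECT ea1.assignment_id, ea2.assignment_id
    FROM employee_assignments ea1
    INNER JOIN employee_assignments ea2
            ON ea1.emp_id = ea2.emp_id
           AND ea1.assignment_id < ea2.assignment_id
           AND ea1.end_date >= ea2.start_date
           AND ea1.start_date <= ea2.end_date
    ORDER BY ea1.assignment_id, ea2.assignment_id
""").fetchall()

print(overlaps)
```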
Whole-day intervals in date_time fields:
date_time_field >= begin_date and date_time_field < end_date_plus_1
Just found another interesting use of an unequal join in the MCTS 70-433 (SQL Server 2008 Database Development) Training Kit book. Verbatim below.
By combining derived tables with unequal joins, you can calculate a variety of cumulative aggregates. The following query returns a running aggregate of orders for each salesperson (my note - with reference to the ubiquitous AdventureWorks sample db):
select
SH3.SalesPersonID,
SH3.OrderDate,
SH3.DailyTotal,
SUM(SH4.DailyTotal) RunningTotal
from
(select SH1.SalesPersonID, SH1.OrderDate, SUM(SH1.TotalDue) DailyTotal
from Sales.SalesOrderHeader SH1
where SH1.SalesPersonID IS NOT NULL
group by SH1.SalesPersonID, SH1.OrderDate) SH3
join
(select SH1.SalesPersonID, SH1.OrderDate, SUM(SH1.TotalDue) DailyTotal
from Sales.SalesOrderHeader SH1
where SH1.SalesPersonID IS NOT NULL
group by SH1.SalesPersonID, SH1.OrderDate) SH4
on SH3.SalesPersonID = SH4.SalesPersonID AND SH3.OrderDate >= SH4.OrderDate
group by SH3.SalesPersonID, SH3.OrderDate, SH3.DailyTotal
order by SH3.SalesPersonID, SH3.OrderDate
The derived tables are used to combine all orders for salespeople who have more than one order on a single day. The join on SalesPersonID ensures that you are accumulating rows for only a single salesperson. The unequal join allows the aggregate to consider only the rows for a salesperson where the order date is earlier than the order date currently being considered within the result set.
In this particular example, the unequal join is creating a "sliding window" kind of sum on the daily total column in SH4.
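For readers without AdventureWorks to hand, the same running-total pattern can be sketched on a tiny table with Python's sqlite3 module. The daily_totals table and its rows are made up and stand in for the derived tables SH3 and SH4 above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the per-salesperson, per-day derived table
    CREATE TABLE daily_totals (
        sales_person_id INTEGER, order_date TEXT, daily_total INTEGER);
    INSERT INTO daily_totals VALUES
        (1, '2024-01-01', 10),
        (1, '2024-01-02', 20),
        (1, '2024-01-03', 5),
        (2, '2024-01-01', 7);
""")

# Each row is joined to itself and to all earlier days for the same salesperson.
rows = conn.execute("""
    SELECT cur.sales_person_id,
           cur.order_date,
           cur.daily_total,
           SUM(prev.daily_total) AS running_total
    FROM daily_totals cur
    INNER JOIN daily_totals prev
            ON cur.sales_person_id = prev.sales_person_id
           AND cur.order_date >= prev.order_date
    GROUP BY cur.sales_person_id, cur.order_date, cur.daily_total
    ORDER BY cur.sales_person_id, cur.order_date
""").fetchall()

for row in rows:
    print(row)
```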
Duplicates:
SELECT
*
FROM
table a, (
SELECT
id,
min(rowid) AS min_rowid
FROM
table
GROUP BY
id
) b
WHERE
a.id = b.id
and a.rowid > b.min_rowid;
If you wanted to get all of the products to offer to a customer and don't want to offer them products that they already have:
SELECT
C.customer_id,
P.product_id
FROM
Customers C
INNER JOIN Products P ON
P.product_id NOT IN
(
SELECT
O.product_id
FROM
Orders O
WHERE
O.customer_id = C.customer_id
)
Most often though, when I use a non-equijoin it's because I'm doing some kind of manual fix to data. For example, the business tells me that a person in a user table should be given all access roles that they don't already have, etc.
If you want to do a dirty join of two not really related tables, you can join with a <>.
For example, you could have a Product table and a Customer table. Hypothetically, if you want to show a list of every product with every customer, you could do something like this:
SELECT *
FROM Product p
JOIN Customer c on p.SKU <> c.SSN
It can be useful. Be careful, though, because it can create ginormous result sets.
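One caveat worth checking before relying on this: a <> join is not quite the same as an explicit CROSS JOIN, because any pair where the comparison is not true (equal values, or a NULL on either side) is dropped. A small sketch with Python's sqlite3 module and made-up rows shows the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (SKU TEXT);
    CREATE TABLE Customer (SSN TEXT);
    INSERT INTO Product VALUES ('A'), ('B');
    -- one customer has no SSN on file
    INSERT INTO Customer VALUES ('111'), ('222'), (NULL);
""")

# Comparisons against NULL are never true, so that customer is lost here.
unequal_count = conn.execute("""
    SELECT COUNT(*) FROM Product p JOIN Customer c ON p.SKU <> c.SSN
""").fetchone()[0]

# CROSS JOIN pairs every product with every customer unconditionally.
cross_count = conn.execute("""
    SELECT COUNT(*) FROM Product p CROSS JOIN Customer c
""").fetchone()[0]

print("<> join rows:", unequal_count)
print("CROSS JOIN rows:", cross_count)
```

If the goal really is every product with every customer, CROSS JOIN states that intent directly and keeps the rows with missing values.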