MS Access: sum of 2 tables in one query - sql

I have 2 tables:
name "mfr"
name "pomfr"
Both have many columns, but some are the same in both tables. I want to sum those shared columns in one query, grouped by one of the shared columns.
Data sample is
table1. mfr
rfno|ppic|pcrt
101 | 10| .30
102 | 15| .50
103 | 18| .68
table2 pomfr
rfno|ppic|pcrt
101 |100 | 1.15
102 | 50 | 1.50
103 | 0 | 0
and result in query should be
mfrquery
rfno|ppic|pcrt
101|110 |1.45
102| 65 |2.00
103| 18 | .68

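I'll be somewhat nice. This probably isn't the most efficient method, but it'll work...
select * into #temp from mfr
union all
select * from pomfr
select rfno, sum(ppic) as ppic, sum(pcrt) as pcrt from #temp group by rfno
What this says is: select everything from mfr, use a union all to append everything from pomfr, and place the result in a temporary table called #temp. Use union all rather than a plain union, because union drops duplicate rows and would lose amounts when two rows are identical. Filter this to the columns and ranges you need. Note that select ... into #temp is SQL Server syntax; in Access, put the union in a saved query or a subquery instead.
Then the 2nd part says: take the sum of ppic and the sum of pcrt from the #temp table and group them by rfno.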
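The union-then-group idea can be tried end to end with the sample data from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Access (an assumption, since SQLite's dialect differs), and it puts the union in a subquery instead of a temporary table.

```python
import sqlite3

# SQLite stands in for Access here; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mfr   (rfno INTEGER, ppic INTEGER, pcrt REAL);
    CREATE TABLE pomfr (rfno INTEGER, ppic INTEGER, pcrt REAL);
    INSERT INTO mfr   VALUES (101, 10, 0.30), (102, 15, 0.50), (103, 18, 0.68);
    INSERT INTO pomfr VALUES (101, 100, 1.15), (102, 50, 1.50), (103, 0, 0);
""")

# Stack the two tables with UNION ALL (keeps duplicates), then sum per rfno.
query = """
    SELECT rfno, SUM(ppic) AS ppic, SUM(pcrt) AS pcrt
    FROM (
        SELECT rfno, ppic, pcrt FROM mfr
        UNION ALL
        SELECT rfno, ppic, pcrt FROM pomfr
    ) AS combined
    GROUP BY rfno
    ORDER BY rfno
"""

# Round pcrt to two decimals to hide floating-point noise in the sums.
totals = [(rfno, ppic, round(pcrt, 2)) for rfno, ppic, pcrt in conn.execute(query)]
for row in totals:
    print(row)
```

The rows printed should match the mfrquery table in the question.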
Since you're new to SO, for future reference, SO people aren't mean, they just want to see you put forth some sort of effort into the problem, I've gotten help SEVERAL times here. Very helpful community! Best of luck to you!

Related

SQL Help - Join small lookup table where not all columns are required (and an other option)

I have one large table with transactions and a smaller lookup table with values I want to add based on 4 common columns. The trick here is not every combination of these 4 columns will exist in the lookup table and there are scenarios where I want it to stop checking and accept the match instead of going to the next column. I also have an "Other" option to default to if it doesn't match any of the options.
Table structures are something like this:
transaction_table
country, trans_id, store_type, store_name, channel, browser, purchase_amount, currency
lookup_table
country, store_name, channel, browser, trans_fee
The data could be something like this:
transaction_table:
country| trans_id| store_type |store_name |channel |browser |amt |currency
US | 001 | Big Box | Target | B&M |N/A |1.45 |USD
US | 002 | Big Box | Target | Online |Chrome |1.79 |USD
US | 003 | Small | Bob's Store| B&M |N/A |2.50 |USD
US | 004 | Big Box | Walmart | B&M |N/A |1.12 |USD
US | 005 | Big Box | Walmart | Online |Firefox |3.79 |USD
US | 006 | Big Box | Amazon | Online |IE |4.54 |USD
US | 007 | Small | Jim's Plc | B&M |IE |2.49 |USD
lookup_table:
country|store_name |channel |browser |trans_fee
US |Target |B&M |N/A |0.25
US |Target |Online | |0.15
US |Walmart | | |0.30
US |Other | | |0.45
So looking at the lookup_table data:
Row 1 is very specific and would be a match on all 4 of the join
columns.
Row 2 would not care what browser was used to shop at Target so
regardless of the "browser" value, the trans_fee should come back
the same (other stores may care though).
Row 3 is saying any transaction with a country='US' and the
store_name='Walmart', regardless of the rest of the join columns
would have the same trans_fee
Row 4 is the "other" scenario where it should look first at the
store_name column and if it doesn't find a match, go to Other.
The lookup_table data can change and may end up being time dependent (start_date and end_date columns added) so it really wouldn't be a good candidate for a long, complex CASE statement.
I was thinking of a combination of checking each column with an IF IN statement but I'm hoping there's a more straightforward conditional join type statement I can use to go column by column and have an other option.
Thanks!
edit: I didn't specify this but I want to basically return all of the data from transaction_table and add the corresponding trans_fee to each line.
You will need to use a conditional JOIN.
Something like this:
SELECT *
FROM lookup_table
LEFT OUTER JOIN transaction_table
ON lookup_table.store_name IS NULL
OR transaction_table.store_name = lookup_table.store_name
Such partial matching is tricky. And your problem is not really that well set up. You seem to have NULLs in some columns and general values in others.
In any case, you can solve this by matching what you can and then using order by to get the best match. In your case, I think this looks like this:
select tt.*,
       (select l.trans_fee
        from lookup_table l
        where l.country = tt.country and
              l.store_name in ('Other', tt.store_name) and
              (l.channel = tt.channel or l.channel is null) and
              (l.browser = tt.browser or l.browser is null)
        order by (case when l.store_name = tt.store_name then 1 else 2 end),
                 (case when l.channel = tt.channel then 1 else 2 end),
                 (case when l.browser = tt.browser then 1 else 2 end)
        fetch first 1 row only
       ) as trans_fee
from transaction_table tt;
This is generic SQL. But the same idea should work in any database.

Return all rows that match column 1 of a row selected by criteria in column 2

Unfortunately, I think I'm quite limited in what solutions I can apply. I'm doing this for work and I only have permissions to SELECT from tables through Access 2010. I can't update or create tables. I can't find useful information like what version of sql is on the backend, let alone access the database directly or use VBA.
Say we have a dataset like this (crude looking, sorry):
MemberID | StatusCd | Date Added
12345 | 200 | 08/01/2016
12345 | 300 | 09/01/2016
12345 | 400 | 10/01/2016
5646 | 400 | 10/01/2016
8946 | 100 | 07/01/2016
Now, this database is massive and it'll be a huge performance issue if I try to pull all members in the table and process it afterwards. What I want is to return all rows that share a MemberID where at least one row for that MemberID is StatusCd 300. For instance, if I wanted information about members that hit Status 300, the desired table would look like:
MemberID | StatusCd | Date Added
12345 | 200 | 08/01/2016
12345 | 300 | 09/01/2016
12345 | 400 | 10/01/2016
However, right now when I try to use a SELECT command with WHERE StatusCd = 300, I only get the one row where that condition is met. I don't know if it will interfere with anything, but I'm currently joining this table to another table on the MemberID to get a smaller, more relevant set of rows to work with. It would also be nice to omit rows that come prior to the StatusCd 300, but that's a small chunk of extra data that won't hurt too much to leave in.
Thanks for any help anybody can provide!
Edit: adjusted phrasing based on comment feedback.
To get all rows for a member where at least 1 row has a statuscd of 300, first select all the corresponding memberids with a sub-query, then select everything with those memberids from the table:
select *
from t
where memberid in (
select memberid from t where statuscd = 300
)
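A quick way to check the sub-query against the sample rows is a sketch like the following, run with Python's sqlite3 module. The table name member_status and the column name date_added are hypothetical stand-ins; the question does not give the real names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# member_status and date_added are assumed names for the question's table.
conn.execute(
    "CREATE TABLE member_status (memberid INTEGER, statuscd INTEGER, date_added TEXT)"
)
conn.executemany("INSERT INTO member_status VALUES (?, ?, ?)", [
    (12345, 200, "08/01/2016"),
    (12345, 300, "09/01/2016"),
    (12345, 400, "10/01/2016"),
    (5646, 400, "10/01/2016"),
    (8946, 100, "07/01/2016"),
])

# The inner query finds members that ever hit status 300;
# the outer query returns every row for those members.
query = """
    SELECT memberid, statuscd, date_added
    FROM member_status
    WHERE memberid IN (
        SELECT memberid FROM member_status WHERE statuscd = 300
    )
    ORDER BY memberid, statuscd
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the three rows for member 12345 come back, since no other member has a status 300 row.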

SQL payments matrix

I want to combine two tables into one:
The first table: Payments
id | 2010_01 | 2010_02 | 2010_03
1 | 3.000 | 500 | 0
2 | 1.000 | 800 | 0
3 | 200 | 2.000 | 300
4 | 700 | 1.000 | 100
The second table is ID and some date (different for every ID)
id | date |
1 | 2010-02-28 |
2 | 2010-03-01 |
3 | 2010-01-31 |
4 | 2010-02-11 |
What I'm trying to achieve is to create a table which contains all payments up to the date in the ID table, something like this:
id | date | T_00 | T_01 | T_02
1 | 2010-02-28 | 500 | 3.000 |
2 | 2010-03-01 | 0 | 800 | 1.000
3 | 2010-01-31 | 200 | |
4 | 2010-02-11 | 1.000 | 700 |
Where T_00 means payment in the same month as 'date' value, T_01 payment in previous month and so on.
Is there a way to do this?
EDIT:
I'm trying to achieve this in MS Access.
The problem is that I cannot connect name of the first table's column with the date in the second (the easiest way would be to treat it as variable)
I added T_00 to T_24 columns in the second (ID) table and was trying to UPDATE those fields
set T_00 =
iif(year(date)&"_"&month(date)=2010_10,
but I realized that that would be too much code for Access to handle if I wanted to do this for every payment period and every T_xx column.
Even if I would write the code for T_00 I would have to repeat it for next 23 periods.
Your Payments table is de-normalized. Those date columns are repeating groups, meaning you've violated First Normal Form (1NF). It's especially difficult because your field names are actually data. As you've found, repeating groups are a complete pain in the ass when you want to relate the table to something else. This is why 1NF is so important, but knowing that doesn't solve your problem.
You can normalize your data by creating a view that UNIONs your Payments table.
Like so:
CREATE VIEW NormalizedPayments (id, Year, Month, Amount) AS
SELECT id,
       2010 AS Year,
       1 AS Month,
       [2010_01] AS Amount
FROM Payments
UNION ALL
SELECT id,
       2010 AS Year,
       2 AS Month,
       [2010_02] AS Amount
FROM Payments
UNION ALL
SELECT id,
       2010 AS Year,
       3 AS Month,
       [2010_03] AS Amount
FROM Payments
And so on if you have more. This is how the Payments table should have been designed in the first place.
It may be easier to use a date field with the value '2010-01-01' instead of a Year and Month field. It depends on your data. You may also want to add WHERE Amount IS NOT NULL to each query in the UNION, or you might want to use Nz([2010_01], 0) AS Amount. Again, it depends on your data and other queries.
It's hard for me to understand how you're joining from here, particularly how the id fields relate because I don't see how they do with the small amount of data provided, so I'll provide some general ideas for what to do next.
Next, you can join your second table to this normalized Payments view on id. To actually produce the result you want, include a calculated field with the difference in months between the date in the second table and the payment period. Then pivot on that field (a crosstab query in Access) to format your results, which is the proper way to display data like your tables do.
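The normalize, join, month-difference, and pivot steps can be sketched together as follows. This uses Python's sqlite3 module rather than Access, so the syntax is not what you would type into Access. The assumptions: the second table is called Dates (a hypothetical name), amounts like 3.000 are read as 3000, id 4 uses 2010-02-11 as in the desired output, and the pivot is done with conditional aggregation instead of a crosstab query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Payments (id INTEGER, "2010_01" INTEGER, "2010_02" INTEGER, "2010_03" INTEGER);
    CREATE TABLE Dates (id INTEGER, date TEXT);  -- hypothetical name for the second table
    INSERT INTO Payments VALUES (1, 3000, 500, 0), (2, 1000, 800, 0),
                                (3, 200, 2000, 300), (4, 700, 1000, 100);
    INSERT INTO Dates VALUES (1, '2010-02-28'), (2, '2010-03-01'),
                             (3, '2010-01-31'), (4, '2010-02-11');

    -- One row per (id, period) instead of one column per period.
    CREATE VIEW NormalizedPayments AS
        SELECT id, 2010 AS year, 1 AS month, "2010_01" AS amount FROM Payments
        UNION ALL
        SELECT id, 2010, 2, "2010_02" FROM Payments
        UNION ALL
        SELECT id, 2010, 3, "2010_03" FROM Payments;
""")

# diff = months between the id's date and the payment period (0 = same month).
# Periods after the date (diff < 0) are dropped, then diff is pivoted into columns.
query = """
    SELECT id, date,
           MAX(CASE WHEN diff = 0 THEN amount END) AS T_00,
           MAX(CASE WHEN diff = 1 THEN amount END) AS T_01,
           MAX(CASE WHEN diff = 2 THEN amount END) AS T_02
    FROM (
        SELECT d.id, d.date, p.amount,
               (CAST(strftime('%Y', d.date) AS INTEGER) * 12
                + CAST(strftime('%m', d.date) AS INTEGER))
               - (p.year * 12 + p.month) AS diff
        FROM Dates d
        JOIN NormalizedPayments p ON p.id = d.id
    ) AS m
    WHERE diff >= 0
    GROUP BY id, date
    ORDER BY id
"""
matrix = conn.execute(query).fetchall()
for row in matrix:
    print(row)
```

Each T_xx column needs one CASE line, so 24 periods would mean 24 short lines rather than 24 nested iif expressions per column.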

GROUP BY and SUMS in MS ACCESS

I'm trying to get a report of how many articles have been sold, especially which one sold the most, both in terms of quantity and value.
I'm trying the query below, thinking that putting [PRICE]*[total] in the GROUP BY expression would work. Unluckily, it does not. I've also tried putting the alias in the GROUP BY expression, but it only says that I need to use a grouping expression for the column [PRICE]*[total], which is what I thought I had done.
SELECT TOP 20 ARTIC, Sum(TOTGIA) AS total, [PRICE]*[total] AS a
FROM Car
GROUP BY ARTIC, [PRICE]*[total]
ORDER BY Sum(TOTGIA) DESC;
Could anyone point me in the right direction?
the error is:
"You tried to execute a query that does not include the specified expression '[PRICE]*[total]' as part of an aggregate function."
the table is something like this:
|artic|totgia|price
+++++++++++++++++++
|aaa | 1 | 10
|aaa | 4 | 10
|bbb | 1 | 200
I would like to have:
|aaa| 5 | 50
|bbb| 1 | 200
so aaa is the first one for number of sells, but bbb is first for cash
The problem here is that you are trying to use the total alias in the select list and in the group by. The alias is not available at that point in the query. Instead, you need to refer to the actual expression, Sum(TOTGIA), in place of total. In other cases you can create a subselect and use the alias in the outer query, but this does not apply to your query as it is written.
SELECT TOP 20 ARTIC, Sum(TOTGIA) AS total, PRICE*Sum(TOTGIA) AS a
FROM Car
GROUP BY ARTIC, PRICE
ORDER BY Sum(TOTGIA) DESC;
If you have an article listed with several different prices, this query will return several rows. So, this data:
|artic|totgia|price
+++++++++++++++++++
|aaa | 1 | 10
|aaa | 4 | 20
|bbb | 1 | 200
Would return these results:
|aaa| 1 | 10
|aaa| 4 | 80
|bbb| 1 | 200
This would happen because we have specifically told sql that we want the unique articles and prices as their own rows. However, this is probably a good thing because in the above scenario, you wouldn't want to return that aaa has a quantity of 5 with a value of 50, since the total value is 90. If this is a possible scenario for your data, you would make this query into a subselect and group all the data for the unique articles together.
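The subselect mentioned above could look like the sketch below, run here with Python's sqlite3 module as a stand-in for Access (so there is no TOP 20, and the column aliases qty, value, total_qty, and total_value are names chosen for illustration). The inner query groups by article and price; the outer query rolls those rows up per article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Car (ARTIC TEXT, TOTGIA INTEGER, PRICE INTEGER)")
# The mixed-price sample from the answer: aaa is sold at both 10 and 20.
conn.executemany("INSERT INTO Car VALUES (?, ?, ?)", [
    ("aaa", 1, 10),
    ("aaa", 4, 20),
    ("bbb", 1, 200),
])

# Inner query: quantity and value per (article, price).
# Outer query: one row per article, summing over its prices.
query = """
    SELECT artic, SUM(qty) AS total_qty, SUM(value) AS total_value
    FROM (
        SELECT ARTIC AS artic,
               SUM(TOTGIA) AS qty,
               PRICE * SUM(TOTGIA) AS value
        FROM Car
        GROUP BY ARTIC, PRICE
    ) AS per_price
    GROUP BY artic
    ORDER BY total_qty DESC
"""
article_totals = conn.execute(query).fetchall()
for row in article_totals:
    print(row)
```

With this data, aaa comes out with a quantity of 5 and a value of 90, and ordering by total_value instead would put bbb first.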

Teradata-replacing self join

I have a table in Teradata with around a trillion records.
Table temp, with cat_nbr as the PI:
cat_nbr | brand_nbr | card_nbr
1       | 10        | 100
1       | 10        | 101
1       | 20        | 100
1       | 20        | 102
2       | 10        | 100
2       | 10        | 103
2       | 30        | 100
2       | 30        | 105
3       | 40        | 106
3       | 30        | 107
I need to find, for a particular brand, the total number of customers across all of its categories.
Just an example, for brand no. 10:
First we need to check which categories have brand no. 10; here categories 1 and 2 have it.
Then, for all customers in categories 1 and 2, we need count(distinct card_nbr).
The result should be like:
brand_nbr|total_cust
10 | 5
I have written the below query to achieve this:
select k.brand_nbr,count(distinct l.card_nbr)
from temp k join temp l on k.cat_nbr=l.cat_nbr
group by 1;
It gives me the proper result, but we have a trillion records in the table, and when I run the query it keeps processing for more than 2 hours.
I need a solution to improve the performance so that it finishes in 30 minutes at most.
I have checked the AMPs; there are 16 AMPs for my database.
Please help me out if you have any solution for this.
Thanks in advance.
The only other approach I can think of is using two steps:
-- This will remove duplicates
CREATE VOLATILE SET TABLE vt AS
(
SELECT k.brand_nbr,l.card_nbr
FROM temp k JOIN temp l ON k.cat_nbr=l.cat_nbr
)
WITH DATA
PRIMARY INDEX(brand_nbr)
ON COMMIT PRESERVE ROWS;
-- Now you can simply count without distinct
SELECT brand_nbr, COUNT(*)
FROM vt
GROUP BY 1;
Depending on your data (number of rows per cat_nbr/brand_nbr) this might be faster. Or slower and totally skewed :-)
Btw, I doubt you store 1 trillion rows on a 16 AMP system, this is at least 30TB, maybe 16 nodes
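The logic of the two-step approach (deduplicate the brand/card pairs first, then count plain rows) can be checked on the sample data with a sketch like this. It runs on Python's sqlite3 module, not Teradata, so it says nothing about performance: SELECT DISTINCT into a temporary table stands in for the volatile SET table, and the source table is named cards here (a hypothetical name, used instead of temp).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "cards" is a stand-in name for the question's temp table.
conn.execute("CREATE TABLE cards (cat_nbr INTEGER, brand_nbr INTEGER, card_nbr INTEGER)")
conn.executemany("INSERT INTO cards VALUES (?, ?, ?)", [
    (1, 10, 100), (1, 10, 101), (1, 20, 100), (1, 20, 102),
    (2, 10, 100), (2, 10, 103), (2, 30, 100), (2, 30, 105),
    (3, 40, 106), (3, 30, 107),
])

# Step 1: self-join on category and keep each (brand, card) pair only once.
# SELECT DISTINCT plays the role of the SET table's duplicate removal.
conn.execute("""
    CREATE TEMP TABLE vt AS
    SELECT DISTINCT k.brand_nbr, l.card_nbr
    FROM cards k JOIN cards l ON k.cat_nbr = l.cat_nbr
""")

# Step 2: the pairs are already unique, so a plain COUNT(*) is enough.
counts = dict(conn.execute(
    "SELECT brand_nbr, COUNT(*) FROM vt GROUP BY brand_nbr ORDER BY brand_nbr"
))
print(counts)
```

Brand 10 comes out with 5 customers, matching the expected result in the question.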
If you don't want to create a volatile SET table (as dnoeth suggested), try using an ordered analytical function. A window COUNT counts every joined row, so remove duplicate brand/card pairs in a derived table first:
SELECT DISTINCT
       brand_nbr,
       COUNT(card_nbr) OVER (PARTITION BY brand_nbr) AS cnt
FROM (
       SELECT DISTINCT k.brand_nbr, l.card_nbr
       FROM temp k JOIN temp l ON k.cat_nbr = l.cat_nbr
     ) AS d
Ordered analytical functions don't need a GROUP BY clause. I am not really sure if this would actually perform better than a volatile table (since the volatile table in dnoeth's solution also uses a primary index, which theoretically should be better for Teradata), but you can give it a try.