SQL Distinct Pair Groupings

I am interested in manipulating my data like so:
My Source Data:
From | To | Rate
----------------
EUR | AUD | 1.5895
EUR | BGN | 1.9558
EUR | GBP | 0.7347
EUR | USD | 1.1151
GBP | AUD | 2.1633
GBP | BGN | 2.6618
GBP | EUR | 1.3610
GBP | USD | 1.5176
USD | AUD | 1.4254
USD | BGN | 1.7539
USD | EUR | 0.8967
USD | GBP | 0.6589
With regard to "distinct pairs", I consider the following to be "duplicates":
EUR | USD matches USD | EUR
EUR | GBP matches GBP | EUR
GBP | USD matches USD | GBP
I want my source data filtered so that exactly one record of each "duplicate" pair above is removed, leaving my final table with 3 fewer records than the original. I do not care which record of a pair is kept, as long as only one is selected.
I have tried many variations of Joins, Exists, Except, Distinct, Group By, logical comparisons (< >) and I feel like I am so close with any given approach... but it just does not seem to click.
My favorite effort has involved inner joining on EXCEPT:
SELECT a.[FROM], a.[TO], a.[Rate]
FROM Table a
INNER JOIN
(
SELECT DISTINCT [From], [To]
FROM Table
EXCEPT
(
SELECT [TO] as [From], [From] as [To]
FROM Table
)
) b
ON a.[From] = b.[From] AND a.[To] = b.[To]
But alas, it removes all of the matched pairs.

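I can suggest something very easy: if it doesn't matter which one of them you keep, you can pick only the one whose rate is bigger than 1 (or, conversely, the one whose rate is smaller). In each pair one rate should be above 1 and the other below it, since the two rates are roughly reciprocals of each other. Note that the filter also applies to rows without a reverse pair, which in the sample data all have rates above 1. So: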
Select * from table where rate>1

One way to remove the duplicates that doesn't depend on the rates:
select s.*
from source s
where from < to
union all
select s.*
from source s
where from > to and
not exists (select 1 from source s2 where s.from = s2.to and s.to = s2.from);
Note: I did not put escape characters around from and to, although you would need them in your actual query.
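To check the logic of the two-branch approach against the question's sample data, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The table name source, the function name dedupe_pairs and the use of SQLite are assumptions for illustration; the asker's actual database is different and would need its own identifier escaping.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("EUR", "AUD", 1.5895), ("EUR", "BGN", 1.9558), ("EUR", "GBP", 0.7347),
    ("EUR", "USD", 1.1151), ("GBP", "AUD", 2.1633), ("GBP", "BGN", 2.6618),
    ("GBP", "EUR", 1.3610), ("GBP", "USD", 1.5176), ("USD", "AUD", 1.4254),
    ("USD", "BGN", 1.7539), ("USD", "EUR", 0.8967), ("USD", "GBP", 0.6589),
]

# Branch 1 keeps every row whose pair is already in alphabetical order.
# Branch 2 keeps reversed rows only when no mirrored row exists.
QUERY = """
SELECT s."from", s."to", s.rate
FROM source s
WHERE s."from" < s."to"
UNION ALL
SELECT s."from", s."to", s.rate
FROM source s
WHERE s."from" > s."to"
  AND NOT EXISTS (SELECT 1 FROM source s2
                  WHERE s2."from" = s."to" AND s2."to" = s."from")
"""


def dedupe_pairs(rows):
    """Load rows into a throwaway SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE source ("from" TEXT, "to" TEXT, rate REAL)')
    conn.executemany("INSERT INTO source VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


kept = dedupe_pairs(ROWS)
print(len(kept))
```

With the sample data this keeps the alphabetically ordered member of each mirrored pair and all rows that have no mirror.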

For completeness, a DISTINCT ON solution (PostgreSQL-specific syntax):
SELECT DISTINCT ON(Least(from, to), Greatest(from, to)) *
FROM
source AS s1
ORDER BY Least(from, to), Greatest(from, to)
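The idea behind Least/Greatest is to build an order-independent key for each pair. DISTINCT ON exists only in PostgreSQL, so the sketch below shows the same canonical-key idea in SQLite instead, using its two-argument scalar MIN/MAX and keeping the first-inserted row of each group. This is a swapped-in technique, not the answer's exact query; the function name one_row_per_pair and the three sample rows are chosen for illustration.

```python
import sqlite3


def one_row_per_pair(rows):
    """Keep the first-inserted row for each unordered (from, to) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE source ("from" TEXT, "to" TEXT, rate REAL)')
    conn.executemany("INSERT INTO source VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT "from", "to", rate
        FROM source
        WHERE rowid IN (
            -- two-argument MIN/MAX are scalar functions in SQLite,
            -- playing the role of LEAST/GREATEST
            SELECT MIN(rowid)
            FROM source
            GROUP BY MIN("from", "to"), MAX("from", "to")
        )
        ORDER BY rowid
    """).fetchall()
    conn.close()
    return result


sample = [
    ("EUR", "USD", 1.1151),
    ("USD", "EUR", 0.8967),  # mirror of the first row
    ("EUR", "AUD", 1.5895),  # no mirror
]
deduped = one_row_per_pair(sample)
print(deduped)
```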

Related

Conditional Join Big Query

I am a beginner with BigQuery and SQL in general. I have a query that looks like this:
SELECT
base.*,
IF( regexp_contains(rate_name, 'usd'), price * ft.usd, IF(regexp_contains(rate_name, 'gbp'), price * ft.gbp, price )) AS converted_price
FROM base_table base
JOIN
finance_table ft
ON
base.date = ft.date
In short, I have a table with some data (base) and depending on the currency that is the price, I want to convert using the rate stored in another table. The table with the rates (finance_table) has data only for 2021 but the base_table has data for dates before that.
What I want to do is use this query as is when the date exists in the finance_table, and otherwise use the rates from 2021-01-01 (the first date in finance_table).
What I tried is to join on this:
ON
IF( ft.date IS NOT NULL, base.date = ft.date, ft.date = '2021-01-01')
However, this doesn't give me any results when I query for a random date from 2020. I am sure that the condition is wrong, so any ideas?
P.S. Another thing that would suffice is using fixed numbers, e.g. if the date doesn't exist, multiply the price with 0.85 or 1.15, but this would probably make things more complicated.
EDIT:
Tables look like this:
BASE:
DATE | PRODUCT_NAME | PRICE | RATE_NAME
2020-01-01| APPLE | 0.5 | usd
2021-01-01| ORANGE | 0.4 | gbp
FINANCE_TABLE:
DATE | USD | GBP
2021-01-01| 0.844 | 1.443
2021-01-02| 0.846 | 1.423
The final result should look like this, when I query for date = '2021-01-01'
DATE | PRODUCT_NAME| PRICE | RATE_NAME | CONVERTED_PRICE
2021-01-01 | ORANGE | 0.4 | gbp | 0.5772
The problem lies in the case where I query for dates that don't exist in the finance_table.

You can use two joins. A direct translation into your query is:
SELECT base.price,
(CASE WHEN base.rate_name = 'usd'
THEN base.price * coalesce(ft.usd, ft1.usd)
WHEN base.rate_name = 'gbp'
THEN base.price * coalesce(ft.gbp, ft1.gbp)
ELSE base.price
END) AS converted_price
FROM base_table base LEFT JOIN
finance_table ft
ON base.date = ft.date JOIN
finance_table ft1
ON ft1.date = DATE '2021-01-01';
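The two-join pattern can be tried out on the question's sample tables. The following is a rough sketch in SQLite via Python's sqlite3, not BigQuery: dates are stored as ISO text, and the fallback date 2021-01-01 is taken from the question's description of finance_table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE base_table (date TEXT, product_name TEXT, price REAL, rate_name TEXT);
CREATE TABLE finance_table (date TEXT, usd REAL, gbp REAL);
INSERT INTO base_table VALUES
  ('2020-01-01', 'APPLE', 0.5, 'usd'),
  ('2021-01-01', 'ORANGE', 0.4, 'gbp');
INSERT INTO finance_table VALUES
  ('2021-01-01', 0.844, 1.443),
  ('2021-01-02', 0.846, 1.423);
""")

# ft matches on the row's own date (NULL when the date is missing);
# ft1 is always the single fallback row, so COALESCE picks ft first.
converted = conn.execute("""
SELECT base.product_name,
       CASE base.rate_name
            WHEN 'usd' THEN base.price * COALESCE(ft.usd, ft1.usd)
            WHEN 'gbp' THEN base.price * COALESCE(ft.gbp, ft1.gbp)
            ELSE base.price
       END AS converted_price
FROM base_table base
LEFT JOIN finance_table ft ON base.date = ft.date
JOIN finance_table ft1 ON ft1.date = '2021-01-01'
ORDER BY base.date
""").fetchall()

for name, price in converted:
    print(name, round(price, 4))
```

The 2020 row has no matching finance_table date, so it falls back to the 2021-01-01 rates, while the 2021 row uses its own date's rates.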

sql server select value depend by date

I have 2 tables :
documents
(docId | date | value | currencyid | currencyRate | netvalue )
1 | 2017/07/30 | 777 | EUR | 4.55 | 150.66
2 | 2017/07/30 | 456 | EUR | 4.55 | 100.00
3 | 2017/07/29 | 440 | RON | 1.00 | 440.00
4 | 2017/07/28 | 999 | RON | 1.00 | 999.00
currencyrates
(only for currencyid = EUR)
(date | currencyRate)
2017/07/30 | 4.55
2017/07/29 | 4.53
2017/07/28 | 4.48
I need to extract, by month, the total sum in euros for all documents. My problem is converting the local value (RON) in documents.value to EUR.
Example 1 in documents: when currencyid = EUR, netvalue is automatically calculated as value/currencyRate (in documents), so all I need is documents.netvalue.
The problem is:
Example 2 in documents: when currencyid = RON, netvalue is expressed in RON and I need to convert it to EUR using the rate on the invoice date (not the current date). So I need to look up the currencyRate in the currencyrates table for each date and use it in a CASE to divide the RON value by it.
And my query:
SELECT
p.name Client, year(d.date) AS Year, month(d.date) AS Month, CONVERT(DECIMAL(10,2),d.CurrencyNetValue) CurrencyNetValue, d.CurrencyId,
CASE
WHEN d.CurrencyId = 1 THEN d.CurrencyNetValue/(select top 1(c.CurrencyRate) from CurrencyRates c inner join documents d on d.date=c.date where c.CurrencyId=2)
WHEN d.CurrencyId = 2 THEN d.CurrencyNetValue
END
AS EuroNetValue
FROM documents d
inner join partners p ON d.partnerid = p.partnerid
WHERE d.doctypeid = 200
ORDER BY d.date DESC
The error is in the subquery, where I try to return the currencyRate for the invoice date. I need it to return a single number, not the whole column.

Solved:
I needed to add an inner join against a derived "table" of the EUR rates:
inner join (select * from CurrencyRates where CurrencyId = 2) as cr on d.Date = cr.Date
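As a rough sketch of how joining the rate on the document date feeds the monthly total, here is the sample data loaded into SQLite through Python's sqlite3. Several things are assumptions: dates are rewritten as ISO text, currencyid is kept as the text codes shown in the sample rather than the numeric ids used in the query, and the netvalue figures are used exactly as given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (docId INTEGER, date TEXT, value REAL,
                        currencyid TEXT, netvalue REAL);
CREATE TABLE currencyrates (date TEXT, currencyRate REAL);
INSERT INTO documents VALUES
  (1, '2017-07-30', 777, 'EUR', 150.66),
  (2, '2017-07-30', 456, 'EUR', 100.00),
  (3, '2017-07-29', 440, 'RON', 440.00),
  (4, '2017-07-28', 999, 'RON', 999.00);
INSERT INTO currencyrates VALUES
  ('2017-07-30', 4.55), ('2017-07-29', 4.53), ('2017-07-28', 4.48);
""")

# Each document is joined to the rate of its own date; RON amounts are
# divided by that rate, EUR amounts are taken as they are.
monthly = conn.execute("""
SELECT strftime('%Y-%m', d.date) AS month,
       SUM(CASE WHEN d.currencyid = 'RON'
                THEN d.netvalue / cr.currencyRate
                ELSE d.netvalue
           END) AS euro_total
FROM documents d
JOIN currencyrates cr ON cr.date = d.date
GROUP BY month
""").fetchall()

for month, total in monthly:
    print(month, round(total, 2))
```

Because the rate comes from a join rather than a correlated subquery, each document row sees exactly one rate value.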

Crosstab of row - SQL - Oracle

I am trying to generate a matrix or crosstab from the rows below:
TBL_CURRENCY_PAIR
ID | ISO_1 | ISO_2
1 | EUR | USD
2 | JPY | USD
4 | GBP | USD
I'd like to obtain an Oracle view that contains something like the following:
VIEW_PAIR
|PAIR|
USD.USD
GBP.USD
EUR.USD
JPY.USD
USD.GBP
GBP.GBP
EUR.GBP
JPY.GBP
USD.EUR
GBP.EUR
EUR.EUR
JPY.EUR
USD.JPY
GBP.JPY
EUR.JPY
JPY.JPY
I have tried using an inner join to build this recursively, but with no success...
thanks in advance for your help,
Have nice day.

Perhaps the following does what you want:
with c as (
select iso_1 as iso
from tbl_currency_pair
union
select iso_2
from tbl_currency_pair
)
select c1.iso || '.' || c2.iso
from c c1 cross join c c2;
This generates every ordered combination of the distinct currencies in the pair table, including each currency paired with itself.
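A quick way to see the result is to run the same CTE and cross join on the three sample rows. The sketch below uses SQLite through Python's sqlite3 rather than Oracle, which is an assumption made only so the example is self-contained; the query text itself follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_currency_pair (id INTEGER, iso_1 TEXT, iso_2 TEXT);
INSERT INTO tbl_currency_pair VALUES
  (1, 'EUR', 'USD'), (2, 'JPY', 'USD'), (4, 'GBP', 'USD');
""")

# UNION removes duplicate currencies; the cross join then pairs
# every distinct currency with every distinct currency.
pairs = [row[0] for row in conn.execute("""
WITH c AS (
  SELECT iso_1 AS iso FROM tbl_currency_pair
  UNION
  SELECT iso_2 FROM tbl_currency_pair
)
SELECT c1.iso || '.' || c2.iso AS pair
FROM c c1 CROSS JOIN c c2
ORDER BY pair
""")]

print(len(pairs))
print(pairs[:4])
```

Four distinct currencies give 4 x 4 pairs, matching the sixteen rows listed in the question.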

Calculations over Multiple Rows SQL Server

If I have data in the format;
Account | Period | Values
Revenue | 2013-01-01 | 5432
Revenue | 2013-02-01 | 6471
Revenue | 2013-03-01 | 7231
Costs | 2013-01-01 | 4321
Costs | 2013-02-01 | 5672
Costs | 2013-03-01 | 4562
And I want to get results out like;
Account | Period | Values
Margin | 2013-01-01 | 1111
Margin | 2013-02-01 | 799
Margin | 2013-03-01 | 2669
M% | 2013-01-01 | .20
M% | 2013-02-01 | .13
M% | 2013-03-01 | .37
Where Margin = Revenue - Costs and M% is (Revenue - Costs)/Revenue for each period.
I can see various ways of achieving this but all are quite ugly and I wanted to know if there was elegant general approach for these sorts of multi-row calculations.
Thanks
Edit
Some of these calculations can get really complicated like
Free Cash Flow = Margin - Opex - Capex + Change in Working Capital + Interest Paid
So I am hoping for a general method that doesn't require lots of joins back to itself.
Thanks

OK, then just take the Max over a Case expression, like this:
with RevAndCost (revenue, costs, period)
as
(
select "Revenue" = Max(Case when account = 'Revenue' then [Values] else null end),
"Costs" = Max(Case when account = 'Costs' then [Values] else null end),
period
from data
group by period
)
select Margin = revenue-costs,
"M%" = (revenue-costs)/nullif(revenue,0)
from RevAndCost
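The Max-over-Case pivot can be exercised on the question's figures. This is a minimal sketch in SQLite via Python's sqlite3, not SQL Server: the value column is renamed amount (an assumption, to avoid quoting the reserved word Values), and the two result sets are stacked with UNION ALL to match the row layout the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (account TEXT, period TEXT, amount REAL);
INSERT INTO data VALUES
  ('Revenue', '2013-01-01', 5432), ('Revenue', '2013-02-01', 6471),
  ('Revenue', '2013-03-01', 7231), ('Costs', '2013-01-01', 4321),
  ('Costs', '2013-02-01', 5672), ('Costs', '2013-03-01', 4562);
""")

# One pass pivots accounts into columns; every derived measure is then
# plain arithmetic on those columns, with no self-joins.
results = conn.execute("""
WITH pivoted AS (
  SELECT period,
         MAX(CASE WHEN account = 'Revenue' THEN amount END) AS revenue,
         MAX(CASE WHEN account = 'Costs' THEN amount END) AS costs
  FROM data
  GROUP BY period
)
SELECT 'Margin' AS account, period, revenue - costs AS amount FROM pivoted
UNION ALL
SELECT 'M%', period, (revenue - costs) / NULLIF(revenue, 0) FROM pivoted
""").fetchall()

margin = {p: v for a, p, v in results if a == "Margin"}
margin_pct = {p: v for a, p, v in results if a == "M%"}
print(margin)
```

Adding another account such as Opex or Capex only needs one more MAX(CASE ...) column in the pivot, which is why this shape suits longer formulas like the Free Cash Flow example.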

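Use a full self-join with a Union, filtering each side of the join to one account:
Select 'Margin' Account,
coalesce(r.period, c.period) Period,
r.[Values] - c.[Values] [Values]
From (Select * From myTable Where Account = 'Revenue') r
Full Join (Select * From myTable Where Account = 'Costs') c
On c.period = r.period
Union
Select 'M%' Account,
coalesce(r.period, c.period) Period,
(r.[Values] - c.[Values]) / r.[Values] [Values]
From (Select * From myTable Where Account = 'Revenue') r
Full Join (Select * From myTable Where Account = 'Costs') c
On c.period = r.period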

Here I use a Common Table Expression to do a full outer join between two instances of your data table, one filtered to Revenue and one to Costs, to pull both into one table, then select from that CTE.
with RevAndCost (revenue, costs, period)
as
(
select ISNULL(rev.[Values],0) as revenue,
ISNULL(cost.[Values],0) as costs,
ISNULL(rev.period,cost.period) as period
from (select * from data where account = 'Revenue') rev
full outer join (select * from data where account = 'Costs') cost
on rev.period=cost.period
)
select Margin = revenue-costs,
"M%" = (revenue-costs)/nullif(revenue,0)
from RevAndCost

I'd do it like this:
SELECT r.PERIOD, r.[VALUES] AS revenue, c.[VALUES] AS cost,
r.[VALUES] - c.[VALUES] AS margin, (r.[VALUES] - c.[VALUES]) / r.[VALUES] AS mPct
FROM
(SELECT PERIOD, [VALUES] FROM t WHERE
ACCOUNT = 'revenue') r INNER JOIN
(SELECT PERIOD, [VALUES] FROM t WHERE
ACCOUNT = 'costs') c ON
r.PERIOD = c.PERIOD

How can I represent a single row from result set as multiple rows?

Given for example a currency rates table with these columns (used 3 here, but in my situation there are about 30):
date | eur | usd | gbp
2010-01-28 | X | Y | Z
How do I convert it to this one (using row with the latest date):
currency | rate
eur | X
usd | Y
gbp | Z
I've come up with a query like this:
SELECT 'eur' AS currency, eur AS rate FROM rates WHERE date = (SELECT MAX(date) FROM rates)
UNION
SELECT 'usd' AS currency, usd AS rate FROM rates WHERE date = (SELECT MAX(date) FROM rates)
UNION
...
It's huge and ugly. Are there other solutions ?

Sometimes the easiest solution (if you want nice-looking queries) is to re-engineer the schema. It may well be that the best solution is to change your table to be:
date | currency | rate
-----------+----------+-----
2010-01-28 | eur | X
2010-01-28 | usd | Y
2010-01-28 | gbp | Z
with suitable indexes on date and currency for performance. That's the way it should be in 3NF since the rates depend on each other, violating the 3NF rule:
Every column must depend on the key, the whole key and nothing but the key, so help me Codd.
(I love that little ditty). Another alternative is to provide a view which does the same thing, then you query the view. It's no less work for the DBMS but your query at least looks prettier (the create view still looks ugly though).
Or you could just accept the fact that some queries look ugly, document it well, and move on :-)

Do you have to do it in SQL?
This is quite trivial using a programming language.
In PHP:
$q = mysql_query("
SELECT eur, usd, gbp
FROM rates
ORDER BY date DESC
LIMIT 1
");
$table = false;
if($val = mysql_fetch_array($q, MYSQL_ASSOC))
{
$table = array(
'eur' => $val['eur'],
'usd' => $val['usd'],
'gbp' => $val['gbp'],
);
}
echo "currency | rate\n";
echo "-----------------\n";
foreach($table as $cur => $rate)
echo $cur." | ".$rate."\n";
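The same client-side approach scales to the roughly 30 columns mentioned in the question if the column names are read from the cursor instead of being listed by hand. Below is a sketch in Python with sqlite3; the function name latest_rates and the rate figures are invented for illustration, and the only schema assumption is a rates table with a date column plus one column per currency.

```python
import sqlite3


def latest_rates(conn):
    """Return (currency, rate) tuples for the most recent row of rates."""
    cur = conn.execute("SELECT * FROM rates ORDER BY date DESC LIMIT 1")
    row = cur.fetchone()
    if row is None:
        return []  # empty table: nothing to unpivot
    # cursor.description holds one entry per column; item 0 is its name
    names = [col[0] for col in cur.description]
    return [(name, value) for name, value in zip(names, row) if name != "date"]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rates (date TEXT, eur REAL, usd REAL, gbp REAL)")
conn.executemany(
    "INSERT INTO rates VALUES (?, ?, ?, ?)",
    [
        ("2010-01-27", 1.0, 1.41, 0.88),  # made-up sample figures
        ("2010-01-28", 1.0, 1.40, 0.87),
    ],
)

print("currency | rate")
for currency, rate in latest_rates(conn):
    print(f"{currency} | {rate}")
```

New currency columns are picked up automatically, at the cost of doing the reshaping outside SQL.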
