Calculate the number of days between entries in two tables and update another table in SQL Server - sql-server-2012

I have two tables, A and B, where A contains only today's data and B contains historical data.
Both tables have the same attributes: pcode, product, market, pcenter, date. All columns hold the same entries except date: in table A the date is today's date, and in table B it is the first entry date in the table.
I am trying to calculate the day difference for the records of table A that are also present in B.
Example: Table A
code  product  market  center  date       No. of Days
X1    abcd     IT04    2G      17/9/2021  0
X1    efgh     ER90    MB      17/9/2021  0
Y5    ijkl     OK09    MB      17/9/2021  0
Table B
code  product  market  center  date
X1    abcd     IT04    2G      15/9/2021
X1    efgh     ER90    MB      16/9/2021
X1    abcd     IT04    2G      11/9/2021
X1    efgh     ER90    C8      11/9/2021
Expected output: the No. of Days column in table A should be updated with the date difference for each record that is also available in table B.
Example: Table A
code  product  market  center  date       No. of Days
X1    abcd     IT04    2G      17/9/2021  6
X1    efgh     ER90    MB      17/9/2021  1
Y5    ijkl     OK09    MB      17/9/2021  0

My solution:
1. I grouped table B to get the minimum date together with the other columns.
2. Afterwards, I joined this subquery to table A to match the rows and subtract the dates.
Note: there may be some syntax problems, since I don't have any data or tables to test with, but I hope you get the logic.
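select a.*,
       -- DATEDIFF(day, start, end): earliest date in B first, today's date in A second;
       -- rows of A with no match in B get 0 instead of NULL
       isnull(datediff(day, h.date, a.date), 0) as no_of_days
from table_a a
left join (select code, product, market, center, min(date) as date
           from table_b
           group by code, product, market, center) h
  on a.code = h.code
 and a.product = h.product
 and a.market = h.market
 and a.center = h.center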
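The question also asks for table A to be updated, not only queried. The following is a minimal sketch of that step, not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, so `julianday` replaces `DATEDIFF`, the dates are rewritten in ISO format (an assumption), and the column name `no_of_days` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (code TEXT, product TEXT, market TEXT, center TEXT,
                      date TEXT, no_of_days INTEGER);
CREATE TABLE table_b (code TEXT, product TEXT, market TEXT, center TEXT,
                      date TEXT);
""")

# Sample rows from the question, with dates converted to ISO format.
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?, ?, 0)", [
    ("X1", "abcd", "IT04", "2G", "2021-09-17"),
    ("X1", "efgh", "ER90", "MB", "2021-09-17"),
    ("Y5", "ijkl", "OK09", "MB", "2021-09-17"),
])
conn.executemany("INSERT INTO table_b VALUES (?, ?, ?, ?, ?)", [
    ("X1", "abcd", "IT04", "2G", "2021-09-15"),
    ("X1", "efgh", "ER90", "MB", "2021-09-16"),
    ("X1", "abcd", "IT04", "2G", "2021-09-11"),
    ("X1", "efgh", "ER90", "C8", "2021-09-11"),
])

# For each row of A, find the earliest matching date in B and store the
# difference in days; rows without a match in B keep 0.
conn.execute("""
UPDATE table_a
SET no_of_days = COALESCE((
    SELECT CAST(julianday(table_a.date) - julianday(MIN(b.date)) AS INTEGER)
    FROM table_b AS b
    WHERE b.code = table_a.code
      AND b.product = table_a.product
      AND b.market = table_a.market
      AND b.center = table_a.center
), 0)
""")

result = conn.execute(
    "SELECT code, product, center, no_of_days FROM table_a ORDER BY code, product"
).fetchall()
print(result)
```

In SQL Server itself the same idea would presumably be written as an UPDATE that joins table A to the grouped subquery, but that variant is untested here.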

Related

Create a column of sum of values after grouping two columns in Power BI

I am trying to sum the values of the third column based on the first two columns and put the result in a new column.
Day Product type price total
1/1/2019 A1 T1 3 8
1/1/2019 A1 T2 5 8
1/2/2019 A2 T1 2 3
1/2/2019 A2 T2 1 3
1/1/2019 B1 T1 4 12
1/1/2019 B1 T2 7 12
1/2/2019 B2 T1 3 5
1/2/2019 B2 T2 2 5
1/3/2019 A1 T2 2 8
1/4/2019 A2 T1 9 11
1/3/2019 B1 T1 6 11
1/3/2019 B1 T2 5 11
1/4/2019 B2 T1 4 4
Total is the sum of price regardless of type, computed for each unique combination of Day and Product.
It is normally not recommended to add a column for summarized values. Summarization is supposed to be done with measures.
It is very easy to get the Total for each Day and Product. First you will define a measure. In the Modeling tab, click New Measure and type Total = SUM(Sales[Price]). I'm assuming the name of your table to be "Sales", so you need to replace it with your own table name.
Then in the report, choose an appropriate visualization and drag and drop Day, Product, and Total. The measure Total calculates the sum of Price for each Day and Product on the fly.
It is also possible to keep the Total of Day and Product in a column inside the model. However, this is not a best practice. Before doing this, try to find a way with measures, and only do this if you are an experienced user and you know there is some good reason to do this.
In this case, in the Modeling tab, click New Column and input this formula.
Total of Day and Product = CALCULATE(
SUM(Sales[Price]),
ALLEXCEPT(Sales, Sales[Day], Sales[Product])
)
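To make the effect of that calculated column concrete, here is a small Python sketch (not DAX, and not part of the original answer) that reproduces the same grouping on the first four rows from the question: every row receives the sum of price over all rows sharing its Day and Product. The variable names are illustrative only.

```python
from collections import defaultdict

# (Day, Product, type, price): the first four rows from the question.
rows = [
    ("1/1/2019", "A1", "T1", 3),
    ("1/1/2019", "A1", "T2", 5),
    ("1/2/2019", "A2", "T1", 2),
    ("1/2/2019", "A2", "T2", 1),
]

# Sum price per (Day, Product), ignoring type.
totals = defaultdict(int)
for day, product, _type, price in rows:
    totals[(day, product)] += price

# Attach the group total to every row, like a calculated column would.
with_total = [
    (day, product, type_, price, totals[(day, product)])
    for day, product, type_, price in rows
]
for row in with_total:
    print(row)
```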
Go to Edit Queries > Add Column > Custom Column and use something like this:
= if [Product] = "A1" and [type] = "T1" then [price] * [total] else [price] * [total] * 2
This calculation is just an example of how it is done, because you didn't provide any information about the criteria for summing the values in the third column. With this example you should be able to create your new column yourself.

SQLite: Divide value from one column based on criteria from other columns

I have a table in SQLite3 with the following structure:
Date Category Value
------------ -------------- -------------
20160101 A 5
20160101 B 3
20160102 A 4
20160102 B 2
20160103 A 7
20160103 B 3
20160104 A 8
20160104 B 1
My goal is to select values from the table so that, for each date, I divide the value of category A by the value of category B. I have exactly one value for each category for every date. That is, the goal is to select two columns with these values:
Date NewValue(A/B)
------------ --------------
20160101 1.6667
20160102 2
20160103 2.3333
20160104 8
I have tried to solve this by creating a temporary table, but I get wrong values.
You can do this using conditional aggregation or a join:
select ta.date, ta.value / tb.value
from t ta join
     t tb
     on ta.date = tb.date and ta.category = 'A' and tb.category = 'B';
One caveat: SQLite does integer division. So, if the values are integers, you should use something like:
select ta.date, ta.value * 1.0 / tb.value
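The answer mentions conditional aggregation but only shows the join. A minimal sketch of the conditional-aggregation variant follows, run through Python's sqlite3 module against the sample data from the question; the table name `t` follows the answer, while the column alias `new_value` is made up here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, category TEXT, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    ("20160101", "A", 5), ("20160101", "B", 3),
    ("20160102", "A", 4), ("20160102", "B", 2),
    ("20160103", "A", 7), ("20160103", "B", 3),
    ("20160104", "A", 8), ("20160104", "B", 1),
])

# One output row per date: pick the A value and the B value with CASE,
# and multiply by 1.0 to avoid integer division.
ratios = conn.execute("""
SELECT date,
       MAX(CASE WHEN category = 'A' THEN value END) * 1.0
       / MAX(CASE WHEN category = 'B' THEN value END) AS new_value
FROM t
GROUP BY date
ORDER BY date
""").fetchall()

for date, new_value in ratios:
    print(date, round(new_value, 4))
```

Because there is exactly one A row and one B row per date, MAX simply selects that single value; with duplicates it would silently pick the largest.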

Select n amount of random rows where n is proportionate to each value's % of total population

I have a table of 58 million customer records. Each customer has a market value (EN, US, FR etc.)
I'm trying to select a 100k sample set which contains customers from every market. The ratio of customers per market in the sample must match the ratios in the actual table.
So if UK customers account for 15% of the records in the customer table then there must be 15k UK customers in the 100k sample set and the same then for each market.
Is there a way to do this?
First, a simple random sample should do pretty well on representing the market sizes. What you are asking for is a stratified sample.
One way to get such a sample is to order the data randomly and assign a sequential number in each group. Then normalize the sequential number to be between 0 and 1, and finally order by the normalized value and choose the top "n" rows:
select top 100000 c.*
from (select c.*,
row_number() over (partition by market order by rand(checksum(newid()))
) as seqnum,
count(*) over (partition by market) as cnt
from customers c
) c
order by cast(seqnum as float) / cnt
It may be clearer what is happening if you look at some data. Consider taking a sample of 5 from:
1 A
2 B
3 C
4 D
5 D
6 D
7 B
8 A
9 D
10 C
The first step assigns a sequential number randomly within each market:
1 A 1
2 B 1
3 C 1
4 D 1
5 D 2
6 D 3
7 B 2
8 A 2
9 D 4
10 C 2
Next, normalize these values:
1 A 1 0.50
2 B 1 0.50
3 C 1 0.50
4 D 1 0.25
5 D 2 0.50
6 D 3 0.75
7 B 2 1.00
8 A 2 1.00
9 D 4 1.00
10 C 2 1.00
Now, if you take the top 5 rows by normalized value, you get one row each from A, B, and C and two rows from D, which is a stratified sample.
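The same steps can be sketched outside the database. The Python function below is an illustration of the technique under the assumption that the data fits in memory, not a replacement for the SQL; `stratified_sample` and its parameters are hypothetical names.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(rows, key, n, rng=None):
    """Pick n rows so that each group is represented proportionally."""
    rng = rng or random.Random()
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)

    ranked = []
    for members in groups.values():
        rng.shuffle(members)                # random order within the group
        cnt = len(members)
        for seqnum, row in enumerate(members, start=1):
            ranked.append((seqnum / cnt, row))  # normalized sequence number

    ranked.sort(key=lambda pair: pair[0])   # order by the normalized value
    return [row for _, row in ranked[:n]]   # take the top n

# The ten (id, market) rows from the example above.
customers = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "D"),
             (6, "D"), (7, "B"), (8, "A"), (9, "D"), (10, "C")]

sample = stratified_sample(customers, key=lambda r: r[1], n=5,
                           rng=random.Random(0))
market_counts = Counter(market for _, market in sample)
print(sample, market_counts)
```

Which individual rows are chosen depends on the shuffle, but for this example the number of rows per market does not.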
With a sample that big, a simple random extraction will give you a good statistical approximation of the original population, as pointed out by Gordon Linoff.
To force equal percentages in the population and the sample, you can calculate and use all the needed parameters: the size of the population and the size of each partition, plus a random ID.
Declare @sampleSize INT;
Set @sampleSize = 100000;
With D AS (
  SELECT customerID
       , Country
       , Count(customerID) OVER (PARTITION BY Null) TotalData
       , Count(customerID) OVER (PARTITION BY Country) CountryData
       , Row_Number() OVER (PARTITION BY Country
                            ORDER BY rand(checksum(newid()))) ID
  FROM customer
)
SELECT customerID
     , Country
FROM D
WHERE ID <= Round((Cast(CountryData as Float) / TotalData) * @sampleSize, 0)
ORDER BY Country
SQLFiddle demo with less data.
Be aware that the rounding in the WHERE condition can make the number of rows returned slightly lower or higher than the desired one; for example, in the demo 9 rows are returned instead of 10.

Check rows for differences between 2 dates for each specific name

So I have a table that consists of about 7 columns. Every day I copy info from an Access database into a SQL table and put a date on each record.
What I am looking to do is compare, for instance, today's records to yesterday's records and check for any changes for each name.
I hope the makeshift table below helps in understanding the question. In the example, three records get dumped in every day; aa, bb, and cc are the names. I want to be able to query whether any information for "aa" has changed between 2 dates.
Table
ID  Name  Info1  Info2  AD  PH  Date
1   aa    yg     yg     a   a   10/17
2   bb    hg     hg     a   a   10/17
3   cc    hg     po     a   a   10/17
4   aa    yg     yg     a   a   10/18
5   bb    hk     hg     a   a   10/18
6   cc    hg     po     a   a   10/18
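-- Group by name rather than by date: grouping by date would compare each
-- day's row only with itself, so no change would ever be reported.
select name
from your_table
where date between '2013-10-16' and '2013-10-17'
and name = 'aa'
group by name
having count(distinct info1) > 1
or count(distinct info2) > 1
or count(distinct AD) > 1
or count(distinct PH) > 1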
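As a quick check of the idea on the sample table, here is a sketch using Python's sqlite3 module. It groups by name across the two dates and reports every name whose information differs, instead of filtering on a single name. The table name `daily` and the year 2013 (borrowed from the query above) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE daily (id INTEGER PRIMARY KEY, name TEXT, info1 TEXT,
                    info2 TEXT, ad TEXT, ph TEXT, date TEXT)
""")
conn.executemany("INSERT INTO daily VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, "aa", "yg", "yg", "a", "a", "2013-10-17"),
    (2, "bb", "hg", "hg", "a", "a", "2013-10-17"),
    (3, "cc", "hg", "po", "a", "a", "2013-10-17"),
    (4, "aa", "yg", "yg", "a", "a", "2013-10-18"),
    (5, "bb", "hk", "hg", "a", "a", "2013-10-18"),
    (6, "cc", "hg", "po", "a", "a", "2013-10-18"),
])

# A name has changed if any column holds more than one distinct value
# across the two dates being compared.
changed = [row[0] for row in conn.execute("""
SELECT name
FROM daily
WHERE date IN (?, ?)
GROUP BY name
HAVING COUNT(DISTINCT info1) > 1
    OR COUNT(DISTINCT info2) > 1
    OR COUNT(DISTINCT ad) > 1
    OR COUNT(DISTINCT ph) > 1
ORDER BY name
""", ("2013-10-17", "2013-10-18"))]
print(changed)
```

Note that COUNT(DISTINCT ...) ignores NULLs, so a change from a value to NULL would not be detected by this approach.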

Deleting values with overlapping dates

I have a table in an MS Access database that looks like this:
ID Symbol Direction Start_val End_val AW
1 ABC up 100 120 10
2 ABC up 110 130 11
3 XYZ down 350 380 15
4 XYZ down 340 390 15
I am trying to delete rows with duplicate symbols and directions that have overlapping start_val and end_val, keeping the one with the highest AW. For example, in the table above, the row with id 1 has a start_val and end_val that overlap those of id 2. Since id 1 has the smaller AW, I want to delete it. For ids 3 and 4, the start_val and end_val overlap but the AW is the same, so the row with the smallest id is deleted.
This should do the trick:
delete t1.*
from tablename t1
inner join tablename t2 on t1.Symbol = t2.Symbol
and t1.Direction = t2.Direction
and t1.Start_val <= t2.End_val
and t1.End_val >= t2.Start_val
and (t1.AW < t2.AW or (t1.AW = t2.AW and t1.ID < t2.ID))
Making the inner join with the same table with the restrictions:
equal Symbol;
equal Direction;
overlapping ranges (t1 starts no later than t2 ends, and t1 ends no earlier than t2 starts);
smaller AW in t1, or equal AW and smaller ID in t1.
will get you the list of rows (t1) that you want to delete. The last restriction also prevents a row from matching itself.
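The rule described in the question can be checked on the sample rows with the sketch below. It is one reading of that rule, written for SQLite through Python's sqlite3 module rather than for MS Access, and it uses a correlated EXISTS instead of a join; the table name `ranges` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE ranges (id INTEGER PRIMARY KEY, symbol TEXT, direction TEXT,
                     start_val INTEGER, end_val INTEGER, aw INTEGER)
""")
conn.executemany("INSERT INTO ranges VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "ABC", "up", 100, 120, 10),
    (2, "ABC", "up", 110, 130, 11),
    (3, "XYZ", "down", 350, 380, 15),
    (4, "XYZ", "down", 340, 390, 15),
])

# Delete a row when another row with the same symbol and direction overlaps
# it and wins: the other row has a higher AW, or the same AW and a higher id.
conn.execute("""
DELETE FROM ranges
WHERE EXISTS (
    SELECT 1
    FROM ranges AS other
    WHERE other.symbol = ranges.symbol
      AND other.direction = ranges.direction
      AND ranges.start_val <= other.end_val
      AND ranges.end_val >= other.start_val
      AND (ranges.aw < other.aw
           OR (ranges.aw = other.aw AND ranges.id < other.id))
)
""")

remaining = [row[0] for row in conn.execute("SELECT id FROM ranges ORDER BY id")]
print(remaining)
```

The strict comparisons on AW and id mean a row can never be deleted on account of itself.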