Merge values of multiple records based on similar criteria in Access/SQL/Excel

Currently I have a table whose rows look like this.
I would like to merge all rows with the same FlNo into a single row. The data in the merged row should follow these criteria:
'FlNo' remains the same
'Start' is the earliest date
'End' is the latest date
'Pattern' represents the days of the week, so it should be the combination of all days of the week that appear in any of the rows (e.g. if Row 1 has Pattern = "12347" and Row 2 has Pattern = "34567", the combined Pattern is "1234567"; if Row 1 = "357" and Row 2 = "357", the combined Pattern stays "357"). This part has bothered me most, as I haven't found an algorithm to solve it.
'AC_Name' is the value that appears most often for a FlNo (in this case, 32)
So the final row would be:
FlNo | Start | End | Pattern | AC_Name |
660 | 26/Mar/2017 | 28/Oct/2017 | 1234567 | 32 |
As the original data is an Excel spreadsheet, the solution should work in an Excel (VBA) or Access (VBA/SQL) environment. The data could be processed in Excel first and then imported into Access, imported into Access and processed there, or half and half. Personally I would prefer to process it in Access with SQL, as there are about 13,000 rows of data.
Please help me find a solution to process this data. Thanks a lot.
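For the Pattern rule on its own, the merge amounts to a set union of the day digits. A minimal sketch in Python, assuming each pattern is a string made only of the characters "1" to "7"; the function name merge_patterns is made up for illustration, and the same logic could be ported to VBA:

```python
def merge_patterns(patterns):
    """Union the day-of-week digits of several pattern strings, in ascending order."""
    days = set()
    for pattern in patterns:
        days.update(pattern)  # each character is one day digit, "1" to "7"
    return "".join(sorted(days))


print(merge_patterns(["12347", "34567"]))  # 1234567
print(merge_patterns(["357", "357"]))      # 357
```

Because single digits sort the same way as characters and as numbers, no conversion to integers is needed.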

Once you have properly fixed the data structure for your Pattern column (one flag column per day of the week, D1 to D7),
you could use min(), max() and group by, joined to a derived table that selects the AC_Name with the highest count:
select
t1.FlNo
, min(t1.Start)
, max(t1.End)
, max(D1)
, max(D2)
, max(D3)
, max(D4)
, max(D5)
, max(D6)
, max(D7)
, t2.AC_Name
from my_table t1
INNER JOIN (
select c.FlNo, c.AC_Name
from (
select FlNo, AC_Name, count(*) AS my_count
from my_table
group by FlNo, AC_Name ) c
INNER JOIN (
select FlNo, max(my_count) AS max_count from (
select FlNo, AC_Name, count(*) AS my_count
from my_table
group by FlNo, AC_Name ) t
group by FlNo ) m on c.FlNo = m.FlNo and c.my_count = m.max_count
) t2 on t1.FlNo = t2.FlNo
group by t1.FlNo, t2.AC_Name

Once you have fixed the data, the query for everything except AC_Name would simply be:
select FlNo, min(start), max(end),
max(IsMonday), max(IsTuesday), . . .
from t
group by FlNo;
Getting AC_Name is tricky. This should work:
select FlNo, min(start), max(end),
max(IsMonday), max(IsTuesday), . . .,
(select top 1 ac_name
from t as t2
where t2.FlNo = t.FlNo
group by ac_name
order by count(*) desc, ac_name
) as ac_name
from t
group by FlNo;
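The grouped query above can be tried end to end with a small, self-contained sketch. It uses Python's sqlite3 module rather than Access, so LIMIT 1 stands in for TOP 1. The table name flights, the column names StartDate and EndDate, the ISO date strings and the three sample rows are all assumptions made for illustration, since the original table is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE flights (
        FlNo INTEGER, StartDate TEXT, EndDate TEXT,
        D1 INTEGER, D2 INTEGER, D3 INTEGER, D4 INTEGER,
        D5 INTEGER, D6 INTEGER, D7 INTEGER,
        AC_Name TEXT
    )
    """
)
# Made-up rows for one flight; D1..D7 are 0/1 flags replacing the Pattern string.
rows = [
    (660, "2017-03-26", "2017-06-30", 1, 1, 1, 1, 0, 0, 1, "32"),  # Pattern 12347
    (660, "2017-07-01", "2017-10-28", 0, 0, 1, 1, 1, 1, 1, "32"),  # Pattern 34567
    (660, "2017-05-01", "2017-05-31", 0, 0, 1, 0, 1, 0, 1, "33"),  # Pattern 357
]
conn.executemany("INSERT INTO flights VALUES (?,?,?,?,?,?,?,?,?,?,?)", rows)

# One output row per FlNo; the correlated subquery picks the most frequent AC_Name.
merged = conn.execute(
    """
    SELECT f.FlNo, MIN(f.StartDate), MAX(f.EndDate),
           MAX(f.D1), MAX(f.D2), MAX(f.D3), MAX(f.D4),
           MAX(f.D5), MAX(f.D6), MAX(f.D7),
           (SELECT g.AC_Name
              FROM flights AS g
             WHERE g.FlNo = f.FlNo
             GROUP BY g.AC_Name
             ORDER BY COUNT(*) DESC, g.AC_Name
             LIMIT 1) AS AC_Name
      FROM flights AS f
     GROUP BY f.FlNo
    """
).fetchall()


def flags_to_pattern(flags):
    """Turn seven 0/1 day flags back into a pattern string such as '1234567'."""
    return "".join(str(day) for day, flag in enumerate(flags, start=1) if flag)


for fl_no, start, end, *flags, ac_name in merged:
    print(fl_no, start, end, flags_to_pattern(flags), ac_name)
```

Storing the days as separate flags is what makes MAX() work as a union: a day is set in the merged row if it is set in any source row.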

Related

Group table by custom column

I have a table transaction_transaction with columns:
id, status, total_amount, date_made, transaction_type
The status can be: Active, Paid, Trashed, Renewed, Void
What I want to do is filter by date and status. But since there are sometimes no records with Renewed or Trashed, I get inconsistent data: when grouping by status it returns only Active and Paid (notice that Renewed and Trashed are missing). I want it to always return something like:
-----------------------------------
Active | 121 | 2017-08-09
Paid | 122 | 2017-08-19
Trashed | 123 | 2017-08-20
Renewed | 123 | 2017-08-20
The SQL query I use:
SELECT
ST.type,
COALESCE(SUM(TR.total_amount), 0) AS amount
FROM sms_admin_status ST
LEFT JOIN transaction_transaction TR ON TR.status = ST.type
WHERE TR.store_id = 21 AND TR.transaction_type = 'Layaway' AND TR.status != 'Void'
AND TR.date_made >= '2018-02-01' AND TR.date_made <= '2018-02-26'
GROUP BY ST.type
Edit: I created a table sms_admin_status, since you said it's bad not to have one and I might have new statuses in the future. I also changed the query to fit my needs.
Use a VALUES list in a subquery and LEFT JOIN your transaction table to it. You may need to COALESCE your sums to have them return 0.
https://www.postgresql.org/docs/10/static/queries-values.html
One possible solution (not a very nice one) is the following:
select statuses.s, date_made, coalesce(SUM(amount), 0)
from (values('active'),('inactive'),('deleted')) statuses(s)
left join transactions t on statuses.s = t.status and
date_made >= '2017-08-08'
group by statuses.s, date_made
I assume that you forgot to add date_made to the group by, so I added it there. As you can see, the possible values are hardcoded in the SQL. A cleaner solution is to create a table holding the possible status values and use it in place of my statuses list.
Use SELECT ... FROM (VALUES) with restriction from the transaction table:
select * from (values('active', 0),('inactive', 0),('deleted', 0)) as statuses
where column1 not in (select status from transactions)
union select status, sum(amount) from transactions group by status
Add the date column as needed; I assume it's a static value.
The conditions in the WHERE clause limit the rows selected unless they are moved into a sub-query. May I suggest something like the following?
SELECT ST.type, COALESCE((SELECT SUM(TR.total_amount)
FROM transaction_transaction TR
WHERE TR.status = ST.type AND TR.store_id = 21 AND TR.transaction_type = 'Layaway' AND TR.status != 'Void'
AND TR.date_made >= '2018-02-01' AND TR.date_made <= '2018-02-26'), 0) AS amount
FROM sms_admin_status ST
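The effect the answers rely on, keeping every status row by filtering inside the join rather than in WHERE, can be checked with a small sketch. It uses Python's sqlite3 module instead of PostgreSQL; the sample transactions are invented, and sms_admin_status is assumed to hold only the four statuses that should be reported.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sms_admin_status (type TEXT);
    INSERT INTO sms_admin_status VALUES ('Active'), ('Paid'), ('Trashed'), ('Renewed');

    CREATE TABLE transaction_transaction (
        id INTEGER, status TEXT, total_amount REAL,
        date_made TEXT, transaction_type TEXT, store_id INTEGER
    );
    -- Invented data: nothing is Renewed, and the only Trashed row is out of range.
    INSERT INTO transaction_transaction VALUES
        (1, 'Active', 100, '2018-02-03', 'Layaway', 21),
        (2, 'Active',  21, '2018-02-10', 'Layaway', 21),
        (3, 'Paid',   122, '2018-02-19', 'Layaway', 21),
        (4, 'Trashed', 50, '2018-03-01', 'Layaway', 21);
    """
)

# All filters on TR live in the ON clause, so unmatched statuses survive the join.
query = """
    SELECT ST.type, COALESCE(SUM(TR.total_amount), 0) AS amount
      FROM sms_admin_status AS ST
      LEFT JOIN transaction_transaction AS TR
        ON TR.status = ST.type
       AND TR.store_id = 21
       AND TR.transaction_type = 'Layaway'
       AND TR.date_made >= '2018-02-01'
       AND TR.date_made <= '2018-02-26'
     GROUP BY ST.type
     ORDER BY ST.type
"""
totals = dict(conn.execute(query).fetchall())
print(totals)
```

With the same conditions placed in a WHERE clause instead, the rows for Renewed and Trashed would have NULL in every TR column and be filtered out, which is the behaviour described in the question.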

TSQL syntax to feed results into subquery

I'm after some help on how best to write a query that does the following. I think I need a subquery but I don't know how to use the data returned in the row to feed back into the subquery without hardcoding values? A subquery may not be the right thing here?
Ideally I only want 1 variable ...WHERE t_Date = '2018-01-01'
Desired Output:
The COUNT Criteria column has the following rules
Date < current row
Area = current row
Name = current row
Value = 1
For example, the first row indicates there are 2 records with Date < '2018-01-01' AND Area = 'Area6' AND Name = 'Name1' AND Value = 1
Example Data:
SQLFiddle: http://sqlfiddle.com/#!18/92ba3/4
Effectively I only want to return the first 2 rows but summarise the historic data into a column based on the output in that column.
The right way to do this is to use the cumulative sum functionality, which is part of ANSI SQL and has been available in SQL Server since 2012:
select t.*,
sum(case when t.t_Value = 1 then 1 else 0 end) over (partition by t_Area, t_Name order by t_Date) as COUNTCriteria
from Table1 t;
This actually includes the current row. If you have only one row per date (for the area/name combo), then you can just subtract the current row's contribution or use a windowing clause:
select t.*,
sum(case when t.t_Value = 1 then 1 else 0 end) over
(partition by t_Area, t_Name
order by t_Date
rows between unbounded preceding and 1 preceding
) as COUNTCriteria
from Table1 t;
Use a self join to find records in the same table that are related to a particular record:
SELECT t1.t_Date, t1.t_Area, t1.t_Name, t1.t_Value,
COUNT(t2.t_Name) AS COUNTCriteria
FROM Table1 as t1
LEFT OUTER JOIN Table1 as t2
ON t1.t_Area=t2.t_Area
AND t1.t_Name=t2.T_Name
AND t2.t_Date<t1.t_Date
AND t2.t_Value=1
GROUP BY t1.t_Date, t1.t_Area, t1.t_Name, t1.t_Value
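The self-join can also be restricted to a single date, which matches the wish for one variable in the WHERE clause. A minimal sketch using Python's sqlite3 module; the rows are invented (the linked fiddle data is not reproduced here) and are chosen so that Area6/Name1 has two earlier rows with t_Value = 1, as in the example from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (t_Date TEXT, t_Area TEXT, t_Name TEXT, t_Value INTEGER);
    -- Invented sample rows, dates stored as ISO text so that '<' compares correctly.
    INSERT INTO Table1 VALUES
        ('2018-01-01', 'Area6', 'Name1', 1),
        ('2018-01-01', 'Area7', 'Name2', 0),
        ('2017-12-01', 'Area6', 'Name1', 1),
        ('2017-11-01', 'Area6', 'Name1', 1),
        ('2017-10-01', 'Area6', 'Name1', 0),
        ('2017-12-01', 'Area7', 'Name2', 0);
    """
)

# t1 is the row being reported; t2 ranges over earlier rows that meet the criteria.
query = """
    SELECT t1.t_Date, t1.t_Area, t1.t_Name, t1.t_Value,
           COUNT(t2.t_Name) AS COUNTCriteria
      FROM Table1 AS t1
      LEFT JOIN Table1 AS t2
        ON t2.t_Area = t1.t_Area
       AND t2.t_Name = t1.t_Name
       AND t2.t_Date < t1.t_Date
       AND t2.t_Value = 1
     WHERE t1.t_Date = ?
     GROUP BY t1.t_Date, t1.t_Area, t1.t_Name, t1.t_Value
     ORDER BY t1.t_Area
"""
result = conn.execute(query, ("2018-01-01",)).fetchall()
for row in result:
    print(row)
```

Filtering t1 in WHERE is safe here because t1 is the preserved side of the LEFT JOIN; the conditions on t2 stay in the ON clause so that rows with no history still appear with a count of 0.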

Percentage difference between numbers in two columns

My SQL experience is fairly minimal so please go easy on me here. I have a table tblForEx and I'm trying to create a query that looks at one particular column LastSalesRateChangeDate and also ForExRate.
Basically what I want to do is for the query to check that LastSalesRateChangeDate and then pull the ForExRate that is on the same line (obviously in the ForExRate column), then I need to check to see if there is a +/- 5% change since the last time the LastSalesRateChangeDate changed. I hope this makes sense, I tried to explain it as clearly as possible.
I believe I would need to create a 'subquery' to look at the LastSalesRateChangeDate and pull the ForEx rate from that date, but I just don't know how to go about this.
I should add this is being done in Access (SQL)
Sample data, here is what the table looks like:
| BaseCur | ForCur | ForExRate | LastSalesRateChangeDate
| USD | BRL | 1.718 | 12/9/2008
| USD | BRL | 1.65 | 11/8/2008
So I would need a query to look at the LastSalesRateChangeDate column, check to see if the date has changed, if so take the ForExRate value and then give a percentage difference of that ForExRate value since the last record.
So the final result would likely look like
"BaseCur" "ForCur" "Percentage Change since Last Sales Rate Change"
USD BRL X%
Gordon's answer pointed in the right direction:
SELECT t2.*, (SELECT top 1 t.ForExRate
FROM tblForEx t
where t.BaseCur=t2.BaseCur AND t.ForCur=t2.ForCur and t.LastSalesRateChangeDate<t2.LastSalesRateChangeDate
order by t.LastSalesRateChangeDate DESC, t.ForExRate DESC
) AS PreviousRate, [ForExRate]/[PreviousRate]-1 AS ChangeRatio
FROM tblForEx AS t2;
Access gives errors where the TOP 1 in the subquery causes "ties". We broke the ties and therefore removed the error by adding an extra item to the ORDER BY clause. To get the ratio to display as a percentage, switch to the design view and change the properties of that column accordingly.
If I understand correctly, you want the previous value. In MS Access, you can use a correlated subquery:
select t.*,
(select top (1) t2.LastSalesRateChangeDate
from tblForEx as t2
where t2.BaseCur = t.BaseCur and t2.ForCur = t.ForCur and
t2.LastSalesRateChangeDate < t.LastSalesRateChangeDate
order by t2.LastSalesRateChangeDate desc
) as prev_LastSalesRateChangeDate
from tblForEx as t;
Now, with this as a subquery, you can get the previous exchange rate using a join:
select t.*, ( (t.ForExRate / tprev.ForExRate) - 1) as change_ratio
from (select t.*,
(select top (1) t2.LastSalesRateChangeDate
from tblForEx as t2
where t2.BaseCur = t.BaseCur and t2.ForCur = t.ForCur and
t2.LastSalesRateChangeDate < t.LastSalesRateChangeDate
order by t2.LastSalesRateChangeDate desc
) as prev_LastSalesRateChangeDate
from tblForEx as t
) as t inner join
tblForEx as tprev
on tprev.BaseCur = t.BaseCur and tprev.ForCur = t.ForCur and
tprev.LastSalesRateChangeDate = t.prev_LastSalesRateChangeDate;
As I understand it, you can use the LEAD function to get the rate from the previous change date in a new column, using the query below:
WITH CTE AS (
SELECT *, LEAD(ForExRate, 1) OVER(PARTITION BY BaseCur, ForCur ORDER BY LastChangeDate DESC) LastValue
FROM #TT
)
SELECT BaseCur, ForCur, ForExRate, LastChangeDate , CAST( ((ForExRate - ISNULL(LastValue, 0))/LastValue)*100 AS float)
FROM CTE
Problem here is:
For the last row in each group (the oldest date), the new column calculated with the LEAD function will be NULL, because there is no earlier row to read from.
If there is only a single row for a particular BaseCur and ForCur, that row will also have NULL in the column.
Resolution:
If you are sure that there will be at least two rows for each BaseCur and ForCur, then you can use a WHERE clause to remove the NULL values from the final result.
WITH CTE AS (
SELECT *, LEAD(ForExRate, 1) OVER(PARTITION BY BaseCur, ForCur ORDER BY LastChangeDate DESC) LastValue
FROM #TT
)
SELECT BaseCur, ForCur, ForExRate, LastChangeDate , CAST( ((ForExRate - ISNULL(LastValue, 0))/LastValue)*100 AS float) Percentage
FROM CTE
WHERE LastValue IS NOT NULL
SELECT basetbl.BaseCur, basetbl.ForCur, basetbl.NewDate, basetbl.OldDate, num2.ForExRate/num1.ForExRate*100 AS PercentChange FROM
(((SELECT t.BaseCur, t.ForCur, MAX(t.LastSalesRateChangeDate) AS NewDate, summary.Last_Date AS OldDate
FROM (tblForEx AS t
LEFT JOIN (SELECT TOP 2 BaseCur, ForCur, MAX(LastSalesRateChangeDate) AS Last_Date FROM tblForEx AS t1
WHERE LastSalesRateChangeDate <>
(SELECT MAX(LastSalesRateChangeDate) FROM tblForEx t2 WHERE t2.BaseCur = t1.BaseCur AND t2.ForCur = t1.ForCur)
GROUP BY BaseCur, ForCur) AS summary
ON summary.ForCur = t.ForCur AND summary.BaseCur = t.BaseCur)
GROUP BY t.BaseCur, t.ForCur, summary.Last_Date) basetbl
LEFT JOIN tblForEx num1 ON num1.BaseCur=basetbl.BaseCur AND num1.ForCur = basetbl.ForCur AND num1.LastSalesRateChangeDate = basetbl.OldDate))
LEFT JOIN tblForEx num2 ON num2.BaseCur=basetbl.BaseCur AND num2.ForCur = basetbl.ForCur AND num2.LastSalesRateChangeDate = basetbl.NewDate;
This uses a series of subqueries. First, you are selecting the most recent date for the BaseCur and ForCur. Then, you are joining onto that the previous date. I do that by using another subquery to select the top two dates, and exclude the one that is equal to the previously established most recent date. This is the "summary" subquery.
Then, you get the BaseCur, ForCur, NewDate, and OldDate in the "basetbl" subquery. After that, it is two simple joins of the original table back onto those dates to get the rate that was applicable then.
Finally, you are selecting your BaseCur, ForCur, and whatever formula you want to use to calculate the rate change. I used a simple ratio in that one, but it is easy to change. You can remove the dates in the first line if you want, they are there solely as a reference point.
It doesn't look pretty, but complicated Access SQL queries never do.
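The previous-rate lookup and the +/- 5% check can be sketched together on the two sample rows from the question. This uses Python's sqlite3 module rather than Access, so LIMIT 1 replaces TOP 1. The dates are stored as ISO text, assuming the originals are month/day/year (the order of the two rows is the same under either reading), and the name THRESHOLD is introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tblForEx (
        BaseCur TEXT, ForCur TEXT, ForExRate REAL, LastSalesRateChangeDate TEXT
    );
    INSERT INTO tblForEx VALUES
        ('USD', 'BRL', 1.718, '2008-12-09'),
        ('USD', 'BRL', 1.65,  '2008-11-08');
    """
)

# For each row, fetch the rate from the latest earlier change date of the same pair.
query = """
    SELECT t.BaseCur, t.ForCur, t.ForExRate,
           (SELECT p.ForExRate
              FROM tblForEx AS p
             WHERE p.BaseCur = t.BaseCur
               AND p.ForCur = t.ForCur
               AND p.LastSalesRateChangeDate < t.LastSalesRateChangeDate
             ORDER BY p.LastSalesRateChangeDate DESC
             LIMIT 1) AS PreviousRate
      FROM tblForEx AS t
     ORDER BY t.LastSalesRateChangeDate DESC
"""
THRESHOLD = 0.05  # flag moves of 5% or more in either direction

report = []
for base, foreign, rate, previous in conn.execute(query):
    if previous is None:
        continue  # oldest row of the pair: nothing to compare against
    change = rate / previous - 1
    report.append((base, foreign, change, abs(change) >= THRESHOLD))

for base, foreign, change, flagged in report:
    print(f"{base} {foreign} {change:+.2%} flagged={flagged}")
# USD BRL +4.12% flagged=False
```

Here the ratio is new rate divided by previous rate, minus 1, which is the same formula as ChangeRatio in the first answer.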

select different Max ID's for different customer

Situation:
We have monthly files that get loaded into our data warehouse. However, instead of replacing the old loads, the new ones are just stacked on top of them. The files are loaded over a period of days.
So when running a SQL script we would get duplicate records. To counteract this, we run a union over 10-20 'customers', selecting Max(loadID) for each, e.g.:
SELECT
Customer,
column2,
column3
FROM
MyTable
WHERE
LOADID = (SELECT MAX(LOADID) FROM MyTable WHERE Customer = 'ASDA')
UNION
SELECT
Customer,
column2,
column3
FROM
MyTable
WHERE
LOADID = (SELECT MAX(LOADID) FROM MyTable WHERE Customer = 'TESCO')
The above union would have to be repeated for multiple customers, so I was thinking there surely has to be a more efficient way.
We can't use a MAX(LoadID) in the SELECT statement, as a possible scenario could entail the following:
Monday: Asda, Tesco, Waitrose loaded into DW (with LoadID as 124)
Tuesday: Sainsburys loaded into DW (with LoadID as 125)
Wednesday: New Tesco loaded into DW (with LoadID as 126)
So I would want LoadID 124 for Asda & Waitrose, 125 for Sainsburys, and 126 for Tesco.
Use window functions:
SELECT t.*
FROM (SELECT t.*, MAX(LOADID) OVER (PARTITION BY Customer) as maxLOADID
FROM MyTable t
) t
WHERE LOADID = maxLOADID;
Would a join to a derived table meet your needs?
select yourfields
from yourtables realTable join
(select customer, max(loadID) maxLoadId
from yourtables
group by customer) derivedTable on derivedTable.customer = realTable.customer
and realTable.loadId = derivedTable.maxLoadId
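The derived-table approach can be checked against the Monday to Wednesday scenario from the question. A minimal sketch using Python's sqlite3 module; the table is reduced to the two columns that matter, and the alias latest is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MyTable (Customer TEXT, LoadID INTEGER);
    -- Loads from the scenario in the question.
    INSERT INTO MyTable VALUES
        ('Asda', 124), ('Tesco', 124), ('Waitrose', 124),
        ('Sainsburys', 125),
        ('Tesco', 126);
    """
)

# The derived table holds one row per customer with that customer's newest load.
query = """
    SELECT t.Customer, t.LoadID
      FROM MyTable AS t
      JOIN (SELECT Customer, MAX(LoadID) AS maxLoadID
              FROM MyTable
             GROUP BY Customer) AS latest
        ON latest.Customer = t.Customer
       AND latest.maxLoadID = t.LoadID
     ORDER BY t.LoadID, t.Customer
"""
latest_loads = conn.execute(query).fetchall()
print(latest_loads)
```

Only the Tesco rows from load 126 survive, while Asda and Waitrose keep load 124, so no per-customer UNION is needed.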

SQL PIVOT TABLE

I have the following data:
ID Data
1 tera
1 add
1 alkd
2 adf
2 add
3 wer
4 minus
4 add
4 ten
I am trying to use a pivot table to push the rows into 1 row with multiple columns per ID.
So as follows:
ID Custom1 Custom2 Custom3 Custom4..........
1 tera add alkd
2 adf add
3 wer
4 minus add ten
I have the following query so far:
INSERT INTO #SpeciInfo
(ID, [Custom1], [Custom2], [Custom3], [Custom4], [Custom5],[Custom6],[Custom7],[Custom8],[Custom9],[Custom10],[Custom11],[Custom12],[Custom13],[Custom14],[Custom15],[Custom16])
SELECT
ID,
[Custom1],
[Custom2],
[Custom3],
[Custom4],
[Custom5],
[Custom6],
[Custom7],
[Custom8],
[Custom9],
[Custom10],
[Custom11],
[Custom12],
[Custom13],
[Custom14],
[Custom15],
[Custom16]
FROM SpeciInfo) p
PIVOT
(
(
[Custom1],
[Custom2],
[Custom3],
[Custom4],
[Custom5],
[Custom6],
[Custom7],
[Custom8],
[Custom9],
[Custom10],
[Custom11],
[Custom12],
[Custom13],
[Custom14],
[Custom15],
[Custom16]
)
) AS pvt
ORDER BY ID;
I need the 16 fields, but I am not exactly sure what I do in the From clause or if I'm even doing that correctly?
Thanks
If what you seek is to dynamically build the columns, that is often called a dynamic crosstab and cannot be done in T-SQL without resorting to dynamic SQL (building the string of the query) which is not recommended. Instead, you should build that query in your middle tier or reporting application.
If you simply want a static solution, an alternative to using PIVOT of what you seek might look something like so in SQL Server 2005 or later:
With NumberedItems As
(
Select Id, Data
, Row_Number() Over( Partition By Id Order By Data ) As ColNum
From SpeciInfo
)
Select Id
, Min( Case When ColNum = 1 Then Data End ) As Custom1
, Min( Case When ColNum = 2 Then Data End ) As Custom2
, Min( Case When ColNum = 3 Then Data End ) As Custom3
, Min( Case When ColNum = 4 Then Data End ) As Custom4
...
From NumberedItems
Group By Id
One serious problem in your original data is that there is no indicator of sequence, and thus there is no means for the system to know which item for a given ID should appear in the Custom1 column as opposed to the Custom2 column. In my query above, I arbitrarily ordered by the Data value.
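If the pivot is built outside the database, as suggested for the dynamic case, the numbering step becomes a simple grouping. A minimal sketch in Python using the sample data from the question; it assumes the order in which rows arrive is the order wanted for Custom1, Custom2 and so on, which is exactly the sequence information the table itself does not carry. The function name pivot and the width parameter are made up for illustration.

```python
from collections import defaultdict

rows = [
    (1, "tera"), (1, "add"), (1, "alkd"),
    (2, "adf"), (2, "add"),
    (3, "wer"),
    (4, "minus"), (4, "add"), (4, "ten"),
]


def pivot(rows, width):
    """Return one tuple per ID: (ID, Custom1, ..., Custom<width>), padded with None."""
    grouped = defaultdict(list)
    for row_id, data in rows:
        grouped[row_id].append(data)  # arrival order decides the column position
    result = []
    for row_id in sorted(grouped):
        values = grouped[row_id][:width]  # extra values beyond width are dropped
        values += [None] * (width - len(values))
        result.append((row_id, *values))
    return result


for line in pivot(rows, width=4):
    print(line)
```

Calling pivot(rows, width=16) would give the 16 columns asked for in the question, with None in every unused position.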