Here is the dataframe in question:
|City|District|Population| Code | ID |
| A | 4 | 2000 | 3 | 21 |
| A | 8 | 7000 | 3 | 21 |
| A | 38 | 3000 | 3 | 21 |
| A | 7 | 2000 | 3 | 21 |
| B | 34 | 3000 | 6 | 84 |
| B | 9 | 5000 | 6 | 84 |
| C | 4 | 9000 | 1 | 28 |
| C | 21 | 1000 | 1 | 28 |
| C | 32 | 5000 | 1 | 28 |
| C | 46 | 20 | 1 | 28 |
I want to regroup the population counts by city to have this kind of output:
|City|Population| Code | ID |
| A | 14000 | 3 | 21 |
| B | 8000 | 6 | 84 |
| C | 15020 | 1 | 28 |
df = df.groupby(['City', 'Code', 'ID'])['Population'].sum()
You can make a group by 'City', 'Code' and 'ID then make sum of 'population'.
+----+---------------+--------------------+------------+----------+-----------------+
| id | restaurant_id | filename | is_profile | priority | show_in_profile |
+----+---------------+--------------------+------------+----------+-----------------+
| 40 | 20 | 1320849687_390.jpg | | | 1 |
| 60 | 24 | 1320853501_121.png | 1 | | 1 |
| 61 | 24 | 1320853504_847.png | | | 1 |
| 62 | 24 | 1320853505_732.png | | | 1 |
| 63 | 24 | 1320853505_865.png | | | 1 |
| 64 | 29 | 1320854617_311.png | 1 | | 1 |
| 65 | 29 | 1320854617_669.png | | | 1 |
| 66 | 29 | 1320854618_636.png | | | 1 |
| 67 | 29 | 1320854619_791.png | | | 1 |
| 74 | 154 | 1320922653_259.png | | | 1 |
| 76 | 154 | 1320922656_332.png | | | 1 |
| 77 | 154 | 1320922657_106.png | | | 1 |
| 84 | 130 | 1321269380_960.jpg | 1 | | 1 |
| 85 | 130 | 1321269383_555.jpg | | | 1 |
| 86 | 130 | 1321269384_251.jpg | | | 1 |
| 89 | 28 | 1321269714_303.jpg | | | 1 |
| 90 | 28 | 1321269716_938.jpg | 1 | | 1 |
| 91 | 28 | 1321269717_147.jpg | | | 1 |
| 92 | 28 | 1321269717_774.jpg | | | 1 |
| 93 | 28 | 1321269717_250.jpg | | | 1 |
| 94 | 28 | 1321269718_964.jpg | | | 1 |
| 95 | 28 | 1321269719_830.jpg | | | 1 |
| 96 | 43 | 1321270013_629.jpg | 1 | | 1 |
+----+---------------+--------------------+------------+----------+-----------------+
I have this table and I want to select the filename for a given list of restaurants ids.
For example for 24,29,154:
+----+---------------
| filename |
+----+---------------
1320853501_121.png (has is_profile 1)
1320854617_311.png (has is_profile 1)
1320922653_259.png (chosen as profile picture because restaurant doesn't have a profile pic but has pictures)
I tried group by and case statements but I got nowhere.Also if you use group by it should be a full group by.
You can do this with aggregation and some logic:
select restaurant_id,
coalesce(max(case when is_profile = 1 then filename end),
max(filename)
) as filename
from t
where restaurant_id in (24, 29, 154)
group by restaurant_id;
First look for the/a profile filename. Next just choose an arbitrary one.
i'm aware that this question has been asked multiple times and I have read those threads to get to where I am now, but those solutions don't seem to be working. I need to have a running total of my ExpectedAmount...
I have the following table:
+--------------+----------------+
| ExpectedDate | ExpectedAmount |
+--------------+----------------+
| 1 | 2485513 |
| 2 | 526032 |
| 3 | 342041 |
| 4 | 195807 |
| 5 | 380477 |
| 6 | 102233 |
| 7 | 539951 |
| 8 | 107145 |
| 10 | 165110 |
| 11 | 18795 |
| 12 | 27177 |
| 13 | 28232 |
| 14 | 154631 |
| 15 | 5566585 |
| 16 | 250814 |
| 17 | 90444 |
| 18 | 105424 |
| 19 | 62132 |
| 20 | 1799349 |
| 21 | 303131 |
| 22 | 459464 |
| 23 | 723488 |
| 24 | 676514 |
| 25 | 17311911 |
| 26 | 4876062 |
| 27 | 4844434 |
| 28 | 4039687 |
| 29 | 1418648 |
| 30 | 4366189 |
| 31 | 9028836 |
+--------------+----------------+
I have the following SQL:
SELECT a.ExpectedDate, a.ExpectedAmount, (SELECT SUM(b.ExpectedAmount)
FROM UnpaidManagement..Expected b
WHERE b.ExpectedDate <= a.ExpectedDate)
FROM UnpaidManagement..Expected a
The result of the above SQL is this:
+--------------+----------------+--------------+
| ExpectedDate | ExpectedAmount | RunningTotal |
+--------------+----------------+--------------+
| 1 | 2485513 | 2485513 |
| 2 | 526032 | 9480889 |
| 3 | 342041 | 46275618 |
| 4 | 195807 | 59866450 |
| 5 | 380477 | 60246927 |
| 6 | 102233 | 60349160 |
| 7 | 539951 | 60889111 |
| 8 | 107145 | 60996256 |
| 10 | 165110 | 2650623 |
| 11 | 18795 | 2669418 |
| 12 | 27177 | 2696595 |
| 13 | 28232 | 2724827 |
| 14 | 154631 | 2879458 |
| 15 | 5566585 | 8446043 |
| 16 | 250814 | 8696857 |
| 17 | 90444 | 8787301 |
| 18 | 105424 | 8892725 |
| 19 | 62132 | 8954857 |
| 20 | 1799349 | 11280238 |
| 21 | 303131 | 11583369 |
| 22 | 459464 | 12042833 |
| 23 | 723488 | 12766321 |
| 24 | 676514 | 13442835 |
| 25 | 17311911 | 30754746 |
| 26 | 4876062 | 35630808 |
| 27 | 4844434 | 40475242 |
| 28 | 4039687 | 44514929 |
| 29 | 1418648 | 45933577 |
| 30 | 4366189 | 50641807 |
| 31 | 9028836 | 59670643 |
+--------------+----------------+--------------+
You can tell from the first few values already that the math is all off, but then at some points the math adds up?! I'm Too confused !! Could someone please point me to another solution or to where I have gone wrong with this?
I am using SQL Server 2008.
It is because your ExpectedDate column of type varchar. Try this:
SELECT a.ExpectedDate, a.ExpectedAmount, (SELECT SUM(b.ExpectedAmount)
FROM UnpaidManagement..Expected b
WHERE CAST(b.ExpectedDate as int) <= CAST(a.ExpectedDate as int))
FROM UnpaidManagement..Expected a
Note that it could be inefficient query.
I think this gonna work:
SELECT a.ExpectedDate,
a.ExpectedAmount,
SUM(b.ExpectedAmount) RuningTotal
FROM UnpaidManagement..Expected a
LEFT JOIN UnpaidManagement..Expected b
ON CAST(b.ExpectedDate as int) <= CAST(a.ExpectedDate as int)
GROUP BY a.ExpectedDate, a.ExpectedAmount
AND:
SELECT a.ExpectedDate,
a.ExpectedAmount,
c.RuningTotal
FROM UnpaidManagement..Expected a
CROSS APPLY (
SELECT SUM(b.ExpectedAmount) AS RuningTotal
FROM UnpaidManagement..Expected b
WHERE CAST(b.ExpectedDate as int) <= CAST(a.ExpectedDate as int)
) c
I have a table where I have a list of people, lets say i have 100 people listed in that table
I need to filter out the people using different criteria's and put them in groups, problem is when i start excluding on the 4th-5th level, performance issues come up and it becomes slow
with lst_tous_movements as (
select
t1.refid_eClinibase
t1.[dthrfinmouvement]
t1.[unite_service_id]
t1.[unite_service_suiv_id]
from sometable t1
)
,lst_patients_hospitalisés as (
select distinct
t1.refid_eClinibase
from lst_tous_movements t1
where
t1.[dthrfinmouvement] = '4000-01-01'
)
,lst_patients_admisUIB_transferes as (
select distinct
t1.refid_eClinibase
from lst_tous_movements t1
left join lst_patients_hospitalisés t2 on t1.refid_eClinibase = t2.refid_eClinibase
where
t1.[unite_service_id] = 4
and t1.[unite_service_suiv_id] <> 0
and t2.refid_eClinibase is null
)
,lst_patients_admisUIB_nonTransferes as (
select distinct
t1.refid_eClinibase
from lst_tous_movements t1
left join lst_patients_admisUIB_transferes t2 on t1.refid_eClinibase = t2.refid_eClinibase
left join lst_patients_hospitalisés t3 on t1.refid_eClinibase = t3.refid_eClinibase
where
t1.[unite_service_id] = 4
and t1.[unite_service_suiv_id] = 0
and t2.refid_eClinibase is null
and t3.refid_eClinibase is null
)
,lst_patients_autres as (
select distinct
t1.refid_eClinibase
from lst_patients t1
left join lst_patients_admisUIB_transferes t2 on t1.refid_eClinibase = t2.refid_eClinibase
left join lst_patients_hospitalisés t3 on t1.refid_eClinibase = t3.refid_eClinibase
left join lst_patients_admisUIB_nonTransferes t4 on t1.refid_eClinibase = t4.refid_eClinibase
where
t2.refid_eClinibase is null
and t3.refid_eClinibase is null
and t4.refid_eClinibase is null
)
as you can see i have a multi level filtering out going on here...
1st i get the people where t1.[dthrfinmouvement] = '4000-01-01'
2nd i get the people with another criteria EXCLUDING the 1st group
3rd i get the people with yet another criteria EXCLUDING the 1st and
the 2nd group
etc..
when i get to the 4th level, my query takes 6 - 10 seconds to complete
is there any way to speed this up ?
this is my dataset i'm working with:
+------------------+-------------------------------+------------------+------------------+-----------------------+
| refid_eClinibase | nodossierpermanent_eClinibase | dthrfinmouvement | unite_service_id | unite_service_suiv_id |
+------------------+-------------------------------+------------------+------------------+-----------------------+
| 25611 | P0017379 | 2013-04-27 | 58 | 0 |
| 25611 | P0017379 | 2013-05-02 | 4 | 2 |
| 25611 | P0017379 | 2013-05-18 | 2 | 0 |
| 85886 | P0077918 | 2013-04-10 | 58 | 0 |
| 85886 | P0077918 | 2013-05-06 | 6 | 12 |
| 85886 | P0077918 | 4000-01-01 | 12 | 0 |
| 91312 | P0083352 | 2013-07-24 | 3 | 14 |
| 91312 | P0083352 | 2013-07-24 | 14 | 3 |
| 91312 | P0083352 | 2013-07-30 | 3 | 8 |
| 91312 | P0083352 | 4000-01-01 | 8 | 0 |
| 93835 | P0085879 | 2013-04-30 | 58 | 0 |
| 93835 | P0085879 | 2013-05-07 | 4 | 2 |
| 93835 | P0085879 | 2013-05-16 | 2 | 0 |
| 93835 | P0085879 | 2013-05-22 | 58 | 0 |
| 93835 | P0085879 | 2013-05-24 | 4 | 0 |
| 93835 | P0085879 | 2013-05-31 | 58 | 0 |
| 93836 | P0085880 | 2013-05-20 | 58 | 0 |
| 93836 | P0085880 | 2013-05-22 | 4 | 2 |
| 93836 | P0085880 | 2013-05-31 | 2 | 0 |
| 97509 | P0089576 | 2013-04-09 | 58 | 0 |
| 97509 | P0089576 | 2013-04-11 | 4 | 0 |
| 102787 | P0094886 | 2013-04-08 | 58 | 0 |
| 102787 | P0094886 | 2013-04-11 | 4 | 2 |
| 102787 | P0094886 | 2013-05-21 | 2 | 0 |
| 103029 | P0095128 | 2013-04-04 | 58 | 0 |
| 103029 | P0095128 | 2013-04-10 | 4 | 1 |
| 103029 | P0095128 | 2013-05-03 | 1 | 0 |
| 103813 | P0095922 | 2013-07-02 | 58 | 0 |
| 103813 | P0095922 | 2013-07-03 | 4 | 6 |
| 103813 | P0095922 | 2013-08-14 | 6 | 0 |
| 105106 | P0097215 | 2013-08-09 | 58 | 0 |
| 105106 | P0097215 | 2013-08-13 | 4 | 0 |
| 105106 | P0097215 | 2013-08-14 | 58 | 0 |
| 105106 | P0097215 | 4000-01-01 | 4 | 0 |
| 106223 | P0098332 | 2013-06-11 | 1 | 0 |
| 106223 | P0098332 | 2013-08-01 | 58 | 0 |
| 106223 | P0098332 | 4000-01-01 | 1 | 0 |
| 106245 | P0098354 | 2013-04-02 | 58 | 0 |
| 106245 | P0098354 | 2013-05-24 | 58 | 0 |
| 106245 | P0098354 | 2013-05-29 | 4 | 1 |
| 106245 | P0098354 | 2013-07-12 | 1 | 0 |
| 106280 | P0098389 | 2013-04-07 | 58 | 0 |
| 106280 | P0098389 | 2013-04-09 | 4 | 0 |
| 106416 | P0098525 | 2013-04-19 | 58 | 0 |
| 106416 | P0098525 | 2013-04-23 | 4 | 0 |
| 106444 | P0098553 | 2013-04-22 | 58 | 0 |
| 106444 | P0098553 | 2013-04-25 | 4 | 0 |
| 106609 | P0098718 | 2013-05-08 | 58 | 0 |
| 106609 | P0098718 | 2013-05-10 | 4 | 11 |
| 106609 | P0098718 | 2013-07-24 | 11 | 12 |
| 106609 | P0098718 | 4000-01-01 | 12 | 0 |
| 106616 | P0098725 | 2013-05-09 | 58 | 0 |
| 106616 | P0098725 | 2013-05-09 | 4 | 1 |
| 106616 | P0098725 | 2013-07-27 | 1 | 0 |
| 106698 | P0098807 | 2013-05-16 | 58 | 0 |
| 106698 | P0098807 | 2013-05-22 | 4 | 6 |
| 106698 | P0098807 | 2013-06-14 | 6 | 1 |
| 106698 | P0098807 | 2013-06-28 | 1 | 0 |
| 106714 | P0098823 | 2013-05-20 | 58 | 0 |
| 106714 | P0098823 | 2013-05-21 | 58 | 0 |
| 106714 | P0098823 | 2013-05-24 | 58 | 0 |
| 106729 | P0098838 | 2013-05-21 | 58 | 0 |
| 106729 | P0098838 | 2013-05-23 | 4 | 1 |
| 106729 | P0098838 | 2013-06-03 | 1 | 0 |
| 107038 | P0099147 | 2013-06-25 | 58 | 0 |
| 107038 | P0099147 | 2013-06-28 | 4 | 1 |
| 107038 | P0099147 | 2013-07-04 | 1 | 0 |
| 107038 | P0099147 | 2013-08-13 | 58 | 0 |
| 107038 | P0099147 | 2013-08-15 | 4 | 6 |
| 107038 | P0099147 | 4000-01-01 | 6 | 0 |
| 107082 | P0099191 | 2013-06-29 | 58 | 0 |
| 107082 | P0099191 | 2013-07-04 | 4 | 6 |
| 107082 | P0099191 | 2013-07-19 | 6 | 0 |
| 107157 | P0099267 | 4000-01-01 | 13 | 0 |
| 107336 | P0099446 | 4000-01-01 | 6 | 0 |
+------------------+-------------------------------+------------------+------------------+-----------------------+
thanks.
It is hard to understand exactly what all your rules are from the question, but the general approach should be to add a "Grouping" column to a singl query that uses a CASE statement to categorize the people.
The conditions in a CASE are evaluated in order, so that if the first criteria is met, then the subsequent criteria are not even evaluated for that row.
Here is some code to get you started....
select t1.refid_eClinibase
,t1.[dthrfinmouvement]
,t1.[unite_service_id]
,t1.[unite_service_suiv_id]
CASE WHEN [dthrfinmouvement] = '4000-01-01' THEN 'Group1 Label'
WHEN condition2 = something THEN 'Group2 Label'
....
WHEN conditionN = something THEN 'GroupN Label'
ELSE 'Catch All Label'
END as person_category
from sometable t1
I have the following MySQL table:
+---------+------------+------+--------+------+---------+------------+-------+---------+----------+------------+------------+
| Version | Yr_Varient | FY | Period | CoA | Company | Item | Mvt | Ptnr_Co | Investee | GC | LC |
+---------+------------+------+--------+------+---------+------------+-------+---------+----------+------------+------------+
| 201 | 1 | 2010 | 1 | 11 | 23 | 1110105000 | 60200 | | | 450000 | 450000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 2110300000 | 60200 | | | -520000 | -520000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 1220221600 | | | | 78080 | 78080 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 2130323000 | | | | 50000 | 50000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 2130322000 | | | | -58080 | -58080 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 3100505000 | | | | -275000 | -275000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 3200652500 | | | | 216920 | 216920 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 3900000000 | | | | 58080 | 58080 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 1110105000 | 60200 | | | 376000 | 376000 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 2110300000 | 60200 | | | -545000 | -545000 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 1220221600 | | | | 452250 | 452250 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 2130323000 | | | | -165000 | -165000 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 2130322000 | | | | -118250 | -118250 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 3100505000 | | | | -937750 | -937750 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 3200652500 | | | | 819500 | 819500 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 3900000000 | | | | 118250 | 118250 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 1110105000 | 60200 | | | 777000 | 777000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2110308000 | 60200 | 43 | | -255000 | -255000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2130321500 | | | | 180000 | 180000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2130322000 | | | | -77000 | -77000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2310407001 | | 1 | | -625000 | -625000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 3100505000 | | | | -2502500 | -2502500 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 3200652500 | | | | 2425500 | 2425500 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 3900000000 | | | | 77000 | 77000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1110105000 | 60200 | | | 2600000 | 2600000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1140161000 | 60200 | | 23 | 430000 | 430000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1140161000 | 60200 | | 26 | 505556 | 505556 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1140160000 | 60200 | 37 | | 255000 | 255000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1160163000 | 60200 | 99999 | 48 | 49428895 | 49428895 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1160163000 | 60200 | 99999 | 49 | 188260175 | 188260175 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2310405500 | | | | -237689070 | -237689070 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2110300000 | 60200 | | | -1000 | -1000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2110300500 | 60200 | | | -3999000 | -3999000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1220221600 | | | | 1571112 | 1571112 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2130321500 | | | | -805556 | -805556 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2130322000 | | | | -556112 | -556112 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3100505000 | | | | -836000 | -836000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3200652500 | | | | 781000 | 781000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3300715700 | | 99999 | 32 | -440000 | -440000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3300715700 | | 99999 | 26 | -61112 | -61112 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3900000000 | | | | 556112 | 556112 |
+---------+------------+------+--------+------+---------+------------+-------+---------+----------+------------+------------+
I need to take all rows with Mvt = 60200 and multiply every GC and LC record in that row by 1.1 and add a new row containing the changes back into the same table with FY set to 2011.
How can I do all this in 1 statement?
Is it even possible to do all this in 1 statement (I know very little about SQL)?
Can this be done in standard SQL as the database will be ported to another Database Server?
I don't know which server it will be.
In standard SQL (there may be better ways in vendor-specific implementations but I tend to prefer standard stuff where possible):
insert into mytable (
Version, Yr_Varient, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee,
FY, GC, LC
) select
Version, Yr_Varient, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee,
2011, GC*1.1, LC*1.1
from mytable
where Mvt = 60200
-- and FY = 2010
You may also want to limit your select statement a little more depending on the results of your testing, such as uncommenting the and FY = 2010 line above to stop copying all your 2009 and 2008 data as well, if any. I asume you only wanted to carry forward the previous year's stuff with a 10% increase on GC and LC.
The way this works is to run the select which gives modified data for FY, GC and LC as per your request, and pump all those rows back into the insert.
insert into mytable (
Version,Yr_Varient,FY,Period,CoA,Company,Item,Mvt,Ptnr_Co,Investee,GC,LC)
SELECT Version ,Yr_Varient,"2011" as FY, Period, CoA, Company , Item , Mvt ,Ptnr_Co , Investee , GC*1.1 as GC, LC*1.1 as LC FROM <table Name>
WHERE Mvt = 60200
INSERT INTO _table_
(Version,
Yr_Varient,
FY,
Period,
CoA,
Company,
Item,
Mvt,
Ptnr_Co,
Investee,
GC,
LC)
SELECT
Version,
Yr_Varient,
2011,
Period,
CoA,
Company,
Item,
Mvt,
Ptnr_Co,
Investee,
GC * 1.1,
LC * 1.1
FROM
_table_
WHERE
Mvt = 60200
AND FY <> 2011
This statement should work in any SQL-Database.
Edit: Too slow