Group records from a set by a common value - sql

I have a problem with grouping a set of records by a common value. The common value isn't one single value, though, so I'm not sure how to approach it.
This is a set / table:
This is what I'd need to achieve:
Is it too tricky, or can it be achieved (SQL 2012)? If it can, please point me in the right direction, because I've hit a bit of a wall :D Thanks!

Use DISTINCT when you want the unique values of a specific column:
SELECT DISTINCT SSC FROM Table_name;

Related

SQL Query is creating way too many repeated rows

I have an issue with a SQL query and how its output is displayed. I have 3 tables that have at least one field in common. When I join 2 of the tables, the information I need is displayed properly, but when I join the third, the output goes insane and duplicates the results far too many times. I need to figure out why this is happening. Below I'll show all the tables and the relations between them.
This is how the tables are related to each other:
This is how the first table (dbo_predios) is defined; only the first three fields are relevant in this case:
This is how the second table (dbo_permisos_obras_mayores) is defined; only the first three fields are relevant here as well, and the second and third can match the first table (dbo_predios):
And here is how the third table (dbo_recepciones_obras_mayores) is defined; only the fourth field is relevant in this case, and it relates to the field of the same name in the second table (dbo_permisos_obras_mayores):
Okay, that covers the structure. The query I'm executing is the following:
SELECT
dbo_predios.codigo_unico_predio,
dbo_permisos_obras_mayores.numero_permiso_edificacion,
dbo_permisos_obras_mayores.fecha_permiso_edificacion
FROM dbo_predios
INNER JOIN dbo_permisos_obras_mayores ON dbo_predios.codigo_manzana_predio = dbo_permisos_obras_mayores.codigo_manzana_predio AND dbo_predios.codigo_lote_predio = dbo_permisos_obras_mayores.codigo_lote_predio
INNER JOIN dbo_recepciones_obras_mayores ON dbo_permisos_obras_mayores.numero_recepcion_permiso = dbo_recepciones_obras_mayores.numero_recepcion_permiso
WHERE dbo_permisos_obras_mayores.codigo_manzana_predio = 9402 AND dbo_permisos_obras_mayores.codigo_lote_predio = 30
And the result of executing the query in that way is this:
Later on I did some trial and error and removed the second inner join line, and the result surprised me. Here is what happened:
Conclusion: in brief, the third table is causing the cartesian product. Why? I wish I knew. What do you think of this particular case? I'd appreciate any help you could give me. Thanks in advance.
Here's the solution - since you are saying that the numero_recepcion_permiso is blank, just add the condition to the inner join, to exclude empty ones:
SELECT
dbo_predios.codigo_unico_predio,
dbo_permisos_obras_mayores.numero_permiso_edificacion,
dbo_permisos_obras_mayores.fecha_permiso_edificacion
FROM dbo_predios
INNER JOIN dbo_permisos_obras_mayores ON dbo_predios.codigo_manzana_predio = dbo_permisos_obras_mayores.codigo_manzana_predio AND dbo_predios.codigo_lote_predio = dbo_permisos_obras_mayores.codigo_lote_predio
INNER JOIN dbo_recepciones_obras_mayores ON dbo_permisos_obras_mayores.numero_recepcion_permiso = dbo_recepciones_obras_mayores.numero_recepcion_permiso
AND dbo_recepciones_obras_mayores.numero_recepcion_permiso <>''
WHERE dbo_permisos_obras_mayores.codigo_manzana_predio = 9402 AND dbo_permisos_obras_mayores.codigo_lote_predio = 30
With that said, should that field be allowed to be blank or NULL? Perhaps you need to add a constraint to your table to prevent that scenario. Another suggestion: why did you choose NUMERIC(18,0) as the data type of the primary key for those tables? I would prefer a simple INT or BIGINT and maybe let the database generate the sequence for me.
Okay, I did what Icarus told me and figured out something useful. I had made a big mistake: the number combination I was trying out didn't have a numero_recepcion_permiso, so the output column was completely blank. When there is an actual numero_recepcion_permiso, it shows correctly. Still, I need the query not to output so many repeated rows. How can I fix that? Thank y'all for your help so far.
First of all, make sure that the values exist in both fields and actually match; otherwise the join can generate that many repeated rows. I can't tell exactly how many rows will be repeated, since I don't know what your actual data is, but checking this may clear up the issue a little.
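The blank-key effect discussed in this thread can be reproduced on a small scale. The following is only a sketch: it uses Python's built-in sqlite3 module rather than SQL Server, the table and column names (permisos, recepciones, numero_recepcion) are shortened hypothetical stand-ins, and the rows are made up. It shows that when the join key is blank on several rows of both tables, every blank row matches every other blank row, and that the extra `<> ''` condition removes those matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE permisos (numero_permiso INTEGER, numero_recepcion TEXT);
    CREATE TABLE recepciones (numero_recepcion TEXT);
    -- three permits with a blank reception number, one with a real one
    INSERT INTO permisos VALUES (1, ''), (2, ''), (3, ''), (4, 'R-100');
    -- four receptions with a blank number, one with a real one
    INSERT INTO recepciones VALUES (''), (''), (''), (''), ('R-100');
""")

join_sql = """
    SELECT COUNT(*)
    FROM permisos p
    INNER JOIN recepciones r
        ON p.numero_recepcion = r.numero_recepcion
"""

# Plain equality join: blanks on both sides all match each other.
rows_all = conn.execute(join_sql).fetchone()[0]

# Same join with the extra condition that excludes blank keys.
rows_non_blank = conn.execute(join_sql + " AND r.numero_recepcion <> ''").fetchone()[0]

print(rows_all)        # 13 = 3 blank permits x 4 blank receptions + 1 real match
print(rows_non_blank)  # 1, only the real match survives
```

Counting rows per join key in each table (GROUP BY the key, HAVING COUNT(*) > 1) is a quick way to find which values cause the multiplication in real data.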

Bigquery - remove duplicates of certain columns, but not all

I have two tables I am left joining together. The first table has transactional-level detail, which causes the key I join to the second table to be duplicated. When I left join the second table, the measure "company_spend" is highly inflated.
I need a way to keep only a single value of the duplicated data. My thought was to run DISTINCT on only those columns, but I don't see that BigQuery supports DISTINCT on some columns and not others.
SELECT UPPER(cwnextt.Current_Contract_Number) AS Current_Contract_Number,
UPPER(cwnextt.Replacement_Contract_Number) AS Replacement_Contract_Number,
UPPER(cwnextt.Current_Contract_Name) AS Current_Contract_Name,
UPPER(cwnextt.Supplier_Top_Parent_Entity_Code) AS Supplier_Top_Parent_Entity_Code,
UPPER(cwnextt.Supplier_Top_Parent_Name) AS Supplier_Top_Parent_Name,
UPPER(cwnextt.company_Entity_Code) AS company_Entity_Code,
UPPER(cwnextt.Facility_Name) AS Facility_Name,
smart.company_Spend AS companySpend
FROM `test_etl_field.contracts_with_member_entity_codes_test_view_2` cwnextt
--this table is what is causing the below table to duplicate,
--but I need all of this data as well in its current format.
LEFT JOIN `test.trans_analysis` tsa
ON TRIM(UPPER(cwnextt.company_entity_code)) = TRIM(UPPER(tsa.company_entity_code))
AND TRIM(UPPER(cwnextt.Supplier_Top_Parent_Entity_Code)) = TRIM(UPPER(tsa.manufacturer_top_parent_entity_code))
AND TRIM(UPPER(cwnextt.Current_Contract_Name)) = TRIM(UPPER(tsa.contract_category))
AND cwnextt.spend_period_yyyyqmm = tsa.spend_period_yyyyqmm
--this table contains "company_spend" which is now duplicated
LEFT JOIN `test_etl_field.ecr_smart_data` smart
ON smart.company_entity_code = cwnextt.company_entity_code
AND (smart.contract_number = cwnextt.current_contract_number
OR smart.contract_number = cwnextt.replacement_contract_number)
AND smart.month_key = cwnextt.spend_period_yyyyqmm
If something can be created that will keep company_spend from duplicating on the second left join, that is what I am after.
I'm not sure I understand all the details of your problem, but here's a fact from the BigQuery docs:
SELECT DISTINCT
A SELECT DISTINCT statement discards duplicate rows
and returns only the remaining rows.
You can't apply DISTINCT to specific columns because it doesn't make sense. Let's say you have 4 columns and call DISTINCT on 3 of them: what is SQL supposed to do with the last one?
You must tell SQL which value to keep for the remaining column, and GROUP BY is the right solution here.
So if you want to:
Remove a column that has been duplicated: just adjust your SELECT to get only the columns you want.
Remove rows that have the same value in specific columns: I would suggest a GROUP BY on the targeted columns, taking the aggregation you want (first, avg, sum or whatever) for the remaining ones.
Remove the value from a row if another row has the same value: you may not want to do that. A row has to keep its value, and you won't get it back. Besides, it's the same problem: which row do you want to keep?
Hope this helps! Feel free to clarify your problem if you want more specific answers.
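To make the GROUP BY suggestion concrete, here is a minimal sketch. It runs against SQLite through Python's sqlite3 module, not BigQuery, and the tables (contracts, spend), the line_item column, and the data are hypothetical simplifications of the ones in the question. It shows the spend being counted once per detail row, and then collapsed to one row per key by grouping on the key columns and choosing an aggregate (MAX here, as an assumption about which value to keep).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contracts (entity_code TEXT, contract_number TEXT, line_item TEXT);
    CREATE TABLE spend (entity_code TEXT, contract_number TEXT, company_spend REAL);
    -- three transaction-level rows share one join key
    INSERT INTO contracts VALUES
        ('E1', 'C1', 'a'), ('E1', 'C1', 'b'), ('E1', 'C1', 'c');
    INSERT INTO spend VALUES ('E1', 'C1', 100.0);
""")

join_clause = """
    FROM contracts c
    LEFT JOIN spend s
        ON s.entity_code = c.entity_code
       AND s.contract_number = c.contract_number
"""

# Summing straight over the join counts the spend once per detail row.
inflated = conn.execute("SELECT SUM(s.company_spend)" + join_clause).fetchone()[0]

# Grouping on the key columns and picking one value per group removes the repeats.
grouped = conn.execute(
    "SELECT c.entity_code, c.contract_number, MAX(s.company_spend)"
    + join_clause
    + " GROUP BY c.entity_code, c.contract_number"
).fetchall()

print(inflated)  # 300.0
print(grouped)   # [('E1', 'C1', 100.0)]
```

The trade-off is that the grouped result no longer carries the per-transaction columns, so if both levels of detail are needed they usually have to come from two separate queries or subqueries.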
While I couldn't resolve this issue in SQL, I used Tableau with a FIXED LOD expression to aggregate the data past the duplicates so the end user could visualize the output accurately. Not ideal, but the SQL route wasn't making sense.

Using a name to look up a code: SQL UPDATE question

UPDATE x1 a SET a.dept_cd = (SELECT DISTINCT dept_cd FROM x2 b WHERE a.nm = b.nm)
That's my SQL.
DISTINCT should make the data unique, but the query fails with this error message:
row subquery returns more than one row
My data is a string,
so I use the name (nm) to look up the code (dept_cd).
Can you help me?
If this query returns that error, it means that more than one dept_cd exists where nm equals the name you are looking for.
DISTINCT only prevents the same dept_cd value from appearing twice; it does not reduce several different values to one.
If you just need the first one, whatever its value, you can add LIMIT 0,1 at the end of your subquery.
If you need a specific value, you have to change your query to isolate it, but without the full context we cannot help you with that.
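A small sketch of both suggestions, under some assumptions: it uses SQLite via Python's sqlite3 module, the rows are invented, and the UPDATE is written without table aliases. Note that SQLite does not raise the "more than one row" error itself (it silently takes the first row), so the sketch first lists the names that would trigger the error in an engine that checks, then applies the single-row version of the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x1 (nm TEXT, dept_cd TEXT);
    CREATE TABLE x2 (nm TEXT, dept_cd TEXT);
    INSERT INTO x1 (nm) VALUES ('kim'), ('lee');
    -- 'kim' maps to two different codes; 'lee' only has an exact duplicate
    INSERT INTO x2 VALUES ('kim', 'D10'), ('kim', 'D20'), ('lee', 'D30'), ('lee', 'D30');
""")

# Names for which DISTINCT still leaves more than one dept_cd.
ambiguous = conn.execute("""
    SELECT nm, COUNT(DISTINCT dept_cd)
    FROM x2
    GROUP BY nm
    HAVING COUNT(DISTINCT dept_cd) > 1
""").fetchall()

# Force a single row per name; ORDER BY makes the choice deterministic.
conn.execute("""
    UPDATE x1
    SET dept_cd = (
        SELECT dept_cd FROM x2
        WHERE x2.nm = x1.nm
        ORDER BY dept_cd
        LIMIT 1
    )
""")
updated = conn.execute("SELECT nm, dept_cd FROM x1 ORDER BY nm").fetchall()

print(ambiguous)  # [('kim', 2)]
print(updated)    # [('kim', 'D10'), ('lee', 'D30')]
```

Running the first query before the update is worthwhile, because LIMIT hides the ambiguity rather than resolving it: for 'kim' the code D20 is silently discarded.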

SQL to Spotfire query filtering issue with multiple tables

I am trying to calculate hours flowing in and out of a cost center. When the cost center lends out an employee for an hour it's +1 and when they borrow an employee for an hour it's -1.
Right now I'm using a query that says
select
columns
from dbo.table
where EmployeeCostCenter <> ProjectCostCenter
So when ProjectCostCenter = ID_CostCenter, it returns +HoursQuantity.
Then I set ID_CostCenter = EmployeeCostCenter and, where ID_CostCenter = EmployeeCostCenter, take -HoursQuantity.
That works fine. The problem is when I import it to Spotfire I can't filter on the main table even after I added the table relations. Can anyone explain why?
I can upload the actual code if needed, but I use 4 queries and a couple of them are quite lengthy. The main table, a temp table to calculate incoming hours, and a temp table to calculate outgoing hours are the only ones involved in this problem I think.
(moved to answer to avoid lengthy discussion)
Essentially, data relations are used to propagate filtering and marking between different data sets. Just like in an RDBMS, the relation is what Spotfire uses as the link between data sets; it's the same as the column or columns you join on. Thus, any column that you wish to filter on in TableA and have the result set limited in TableB (or vice versa) must be a relation.
Column matches aren't related columns, but are associated for aggregations, category axes, etc. within each visualization. So if TableA has "amount" and TableB has "amount debit" and you wanted to use both of these in an expression, say Sum([TableA].[amount],[TableB].[amount debit]), they would need to be matched in order not to produce erroneous results.
Lastly, once you set up your relations, you should check your filter panel to set up how you want the filtering to work. You can have the rows included, excluded, or ignored altogether. Here is a link explaining that.

How is it possible to see a column name from a different table within a subquery from different table?

I was practicing subqueries in SQL and stumbled onto an unusual query that I never thought could work.
The exercise is:
Write a query to display the average rate of the Australian dollar where the currency rate date is July 1, 2005.
And the query was:
USE AdventureWorks2012
SELECT AverageRate FROM Sales.CurrencyRate
WHERE ToCurrencyCode='AUD' AND CurrencyRateDate IN
(SELECT CurrencyRateDate FROM Sales.Currency
WHERE CurrencyRateDate='2005-07-01')
So my question is: how is it possible to use the column name "CurrencyRateDate" in the subquery when it actually belongs to the table "CurrencyRate"?
I know my query is not written the way it should be.
I'm extremely sorry if my title doesn't make sense. If you can think of a better one, please change it.
Thanks
AND CurrencyRateDate IN
(SELECT CurrencyRateDate FROM Sales.Currency
WHERE CurrencyRateDate='2005-07-01')
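All the CurrencyRateDate references here point to the column from the outer query. Sales.Currency has no column with that name, so the unqualified name is resolved against the outer table, Sales.CurrencyRate, which makes this a correlated subquery.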
So for each row in the outer query, you are getting a list consisting of only that row's CurrencyRateDate, repeated once for every row in the Sales.Currency table (if the CurrencyRateDate of that row is 2005-07-01, otherwise the list is empty).
Then you check whether the outer CurrencyRateDate value is in that list. Which it is, if and only if it's equal to 2005-07-01 (assuming there is at least one row in Sales.Currency).
So your query is equivalent to:
SELECT AverageRate FROM Sales.CurrencyRate
WHERE ToCurrencyCode='AUD' AND CurrencyRateDate='2005-07-01'
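The name resolution described above can be checked with a small experiment. This is a sketch only: it uses SQLite through Python's sqlite3 module instead of SQL Server, and the tables (currency_rate, currency), their columns, and the rows are simplified hypothetical stand-ins for the AdventureWorks ones. It compares the subquery form with the plain filter, and then empties the inner table to show the "at least one row" caveat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE currency_rate (to_code TEXT, rate_date TEXT, average_rate REAL);
    CREATE TABLE currency (code TEXT, name TEXT);  -- note: no rate_date column
    INSERT INTO currency_rate VALUES
        ('AUD', '2005-07-01', 1.5), ('AUD', '2005-07-02', 1.6), ('EUR', '2005-07-01', 0.8);
    INSERT INTO currency VALUES ('AUD', 'Australian Dollar'), ('EUR', 'Euro');
""")

# rate_date inside the subquery can only refer to the outer currency_rate row.
with_subquery = """
    SELECT average_rate FROM currency_rate
    WHERE to_code = 'AUD' AND rate_date IN
        (SELECT rate_date FROM currency WHERE rate_date = '2005-07-01')
"""
plain_filter = """
    SELECT average_rate FROM currency_rate
    WHERE to_code = 'AUD' AND rate_date = '2005-07-01'
"""

same = conn.execute(with_subquery).fetchall() == conn.execute(plain_filter).fetchall()

# With no rows in currency, the subquery's list is always empty.
conn.execute("DELETE FROM currency")
after_delete = conn.execute(with_subquery).fetchall()

print(same)          # True
print(after_delete)  # []
```

Qualifying column names with table aliases inside subqueries turns this kind of accidental outer reference into an explicit error, since a qualified reference to a column the inner table lacks fails to compile.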