Selecting a distinct combination of 2 columns in SQL

When I run a select after a number of joins on my table, I get an output of 2 columns, and I want to select a distinct combination of Col1 and Col2 for the rowset returned.
The query that I run will be something like this:
select a.Col1,b.Col2 from a inner join b on b.Col4=a.Col3
The output will look something like this:
Col1 Col2
1 z
2 z
2 x
2 y
3 x
3 x
3 y
4 a
4 b
5 b
5 b
6 c
6 c
6 d
Now I want the output to be something like the following:
1 z
2 y
3 x
4 a
5 b
6 d
It's OK if the second column is picked randomly: my query output is about a million rows, and I really don't think there will be a case where the Col1 and Col2 values come out the same. Even if that happens, I can edit the value.
Can you please help me with this? I think I basically need a third column holding a row number, and then I need to select the two columns based on a random row number. I don't know how to translate this to SQL.
Consider the case 1a 1b 1c 1d 1e 2a 2b 2c 2d 2e. GROUP BY will give me all of these results, whereas I want 1a and 2d, or 1a and 2b, or any such combination.
OK, let me explain what I'm expecting:
with rs as(
select a.Col1,b.Col2,rownumber() as rowNumber from a inner join b on b.Col4=a.Col3)
select rs.Col1,rs.Col2 from rs where rs.rowNumber=Round( Rand() *100)
Now I am not sure how to get the row number or the random selection working correctly!
Thanks in advance.

If you simply don't care which Col2 value is returned:
select a.Col1,MAX(b.Col2) AS Col2
from a inner join b on b.Col4=a.Col3
GROUP BY a.Col1
If you do want a random value you could use the approach below.
;WITH T
AS (SELECT a.Col1,
           b.Col2,
           ROW_NUMBER() OVER (PARTITION BY a.Col1 ORDER BY (SELECT NEWID())
           ) AS RN
    FROM a
         INNER JOIN b
           ON b.Col4 = a.Col3)
SELECT Col1,
       Col2
FROM T
WHERE RN = 1
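The ROW_NUMBER approach can be tried out without a SQL Server instance. The following is only a sketch in Python using the standard `sqlite3` module: the `joined` table is a hypothetical stand-in for the result of the join in the question, SQLite's `random()` is swapped in for `NEWID()`, and it assumes the bundled SQLite is version 3.25 or later (needed for window functions).

```python
import sqlite3

# Hypothetical stand-in for the rows produced by the join in the question.
rows = [(1, "z"), (2, "z"), (2, "x"), (2, "y"), (3, "x"), (3, "x"), (3, "y"),
        (4, "a"), (4, "b"), (5, "b"), (5, "b"), (6, "c"), (6, "c"), (6, "d")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joined (Col1 INTEGER, Col2 TEXT)")
conn.executemany("INSERT INTO joined VALUES (?, ?)", rows)

# Number the rows of each Col1 group in random order, then keep row 1.
# random() plays the role that NEWID() plays in the SQL Server version.
query = """
WITH T AS (
    SELECT Col1,
           Col2,
           ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY random()) AS RN
    FROM joined
)
SELECT Col1, Col2 FROM T WHERE RN = 1 ORDER BY Col1
"""
picked = conn.execute(query).fetchall()
print(picked)  # one (Col1, Col2) pair per Col1; Col2 varies between runs
```

Each run returns exactly one row per Col1 value, and the Col2 value is always one that actually occurs with that Col1.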
Alternatively, use a CLR aggregate function. This approach has the advantage that it avoids the sort by partition and NEWID(). An example implementation is below.
using System;
using System.Data.SqlTypes;
using System.IO;
using System.Security.Cryptography;
using Microsoft.SqlServer.Server;
[Serializable]
[SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize = 8000)]
public struct Random : IBinarySerialize
{
private MaxSoFar _maxSoFar;
public void Init()
{
}
public void Accumulate(SqlString value)
{
int rnd = GetRandom();
if (!_maxSoFar.Initialised || (rnd > _maxSoFar.Rand))
_maxSoFar = new MaxSoFar(value, rnd);
}
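public void Merge(Random group)
{
// Keep the candidate with the larger random tag, matching Accumulate;
// an uninitialised partial aggregate never replaces an initialised one.
if (group._maxSoFar.Initialised &&
(!_maxSoFar.Initialised || group._maxSoFar.Rand > _maxSoFar.Rand))
{
_maxSoFar = group._maxSoFar;
}
}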
private static int GetRandom()
{
var buffer = new byte[4];
new RNGCryptoServiceProvider().GetBytes(buffer);
return BitConverter.ToInt32(buffer, 0);
}
public SqlString Terminate()
{
return _maxSoFar.Value;
}
#region Nested type: MaxSoFar
private struct MaxSoFar
{
private SqlString _value;
public MaxSoFar(SqlString value, int rand) : this()
{
Value = value;
Rand = rand;
Initialised = true;
}
public SqlString Value
{
get { return _value; }
set
{
_value = value;
IsNull = value.IsNull;
}
}
public int Rand { get; set; }
public bool Initialised { get; set; }
public bool IsNull { get; set; }
}
#endregion
#region IBinarySerialize Members
public void Read(BinaryReader r)
{
_maxSoFar.Rand = r.ReadInt32();
_maxSoFar.Initialised = r.ReadBoolean();
_maxSoFar.IsNull = r.ReadBoolean();
if (_maxSoFar.Initialised && !_maxSoFar.IsNull)
_maxSoFar.Value = r.ReadString();
}
public void Write(BinaryWriter w)
{
w.Write(_maxSoFar.Rand);
w.Write(_maxSoFar.Initialised);
w.Write(_maxSoFar.IsNull);
// Mirror the condition in Read so both sides agree on the layout.
if (_maxSoFar.Initialised && !_maxSoFar.IsNull)
w.Write(_maxSoFar.Value.Value);
}
#endregion
}
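The idea behind the aggregate (give every row a random tag and keep the value with the largest tag, so no sort is needed) can be sketched outside SQL Server as well. The following is an illustration only, not the CLR code above: it uses Python's `sqlite3.Connection.create_aggregate`, and the names `RandomPick`, `random_pick`, and the `joined` table are hypothetical.

```python
import random
import sqlite3


class RandomPick:
    """Single-pass aggregate: keep the value that drew the largest random tag."""

    def __init__(self):
        self.best_tag = None
        self.best_value = None

    def step(self, value):
        # Called once per row in the group.
        tag = random.random()
        if self.best_tag is None or tag > self.best_tag:
            self.best_tag = tag
            self.best_value = value

    def finalize(self):
        # Called once per group; returns the surviving value.
        return self.best_value


rows = [(1, "z"), (2, "z"), (2, "x"), (2, "y"), (3, "x"), (3, "x"), (3, "y"),
        (4, "a"), (4, "b"), (5, "b"), (5, "b"), (6, "c"), (6, "c"), (6, "d")]

conn = sqlite3.connect(":memory:")
conn.create_aggregate("random_pick", 1, RandomPick)
conn.execute("CREATE TABLE joined (Col1 INTEGER, Col2 TEXT)")
conn.executemany("INSERT INTO joined VALUES (?, ?)", rows)

picked = conn.execute(
    "SELECT Col1, random_pick(Col2) FROM joined GROUP BY Col1 ORDER BY Col1"
).fetchall()
print(picked)  # one row per Col1; the Col2 value varies between runs
```

Because every row in a group draws its tag independently, each row has the same chance of holding the largest tag, which is what makes the pick uniform within the group.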

You need to group by a.Col1 to get rows that are distinct on a.Col1 only. Since b.Col2 is not included in the GROUP BY, you then need an aggregate function to reduce all values in the group to just one; MIN is good enough if you just want one of the values.
select a.Col1, MIN(b.Col2) as c2
from a
inner join b on b.Col4=a.Col3
group by a.Col1
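To see what the aggregate does to the sample data from the question, here is a small sketch in Python with the standard `sqlite3` module. The `joined` table is a hypothetical stand-in for the output of the join; the GROUP BY itself is the same as in the answer.

```python
import sqlite3

# Hypothetical stand-in for the rows produced by the join in the question.
rows = [(1, "z"), (2, "z"), (2, "x"), (2, "y"), (3, "x"), (3, "x"), (3, "y"),
        (4, "a"), (4, "b"), (5, "b"), (5, "b"), (6, "c"), (6, "c"), (6, "d")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joined (Col1 INTEGER, Col2 TEXT)")
conn.executemany("INSERT INTO joined VALUES (?, ?)", rows)

# One row per Col1; MIN keeps the alphabetically smallest Col2 of each group.
result = conn.execute(
    "SELECT Col1, MIN(Col2) AS c2 FROM joined GROUP BY Col1 ORDER BY Col1"
).fetchall()
print(result)
# [(1, 'z'), (2, 'x'), (3, 'x'), (4, 'a'), (5, 'b'), (6, 'c')]
```

The pick is deterministic rather than random, which is fine when any one value per Col1 will do.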

If I understand you correctly, you want one row for each combination of values in columns 1 and 2. That can easily be done with GROUP BY or DISTINCT,
for instance:
SELECT col1, col2
FROM Your Join
GROUP BY col1, col2

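You must use a GROUP BY clause. Because b.Col2 is not in the GROUP BY list, most databases require it to be wrapped in an aggregate such as MIN:
select a.Col1, MIN(b.Col2) as Col2
from a
inner join b on b.Col4=a.Col3
group by a.Col1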

Related

Understanding MVC SQL query. From, where, then select

I inherited the code below and was hoping someone could explain it to me. The model definition is fine, but I am not understanding the string Minor1 setup. The SQL query looks backwards. Is this a normal practice? I tried putting the query in the normal order (select, from, where) and it did not work.
Example for model:
[Table("MINORS")]
public class VerifyDataModel_MINORS
{
[Key]
public string MINORS_ID { get; set; }
public string MINORS_DESC { get; set; }
}
string Minor1 = (from A in db.VerifyDataModel_STUDENT_PROGRAM
join B in db.VerifyDataModel_STPR_STATUS on A.STUDENT_PROGRAMS_ID equals B.STUDENT_PROGRAMS_ID into AB
from AB1 in AB.Where(B => B.POS == 1).DefaultIfEmpty()
join C in db.VerifyDataModel_STPR_MINOR_LISTS on A.STUDENT_PROGRAMS_ID equals C.STUDENT_PROGRAMS_ID into ABC
from ABC1 in ABC.Where(C => C.STPR_MINORS.Substring(0, 2) != "XX").DefaultIfEmpty()
join D in db.VerifyDataModel_MINOR on ABC1.STPR_MINORS equals D.MINORS_ID into ABCD
from ABCD1 in ABCD.DefaultIfEmpty()
where AB1.STPR_STATUS == "A"
where ABC1.STUDENT_PROGRAMS_ID.Substring(8, 5) != "GENED"
where A.STUDENT_PROGRAMS_ID.Substring(0, 7) == personId
where ABC1.STPR_MINOR_END_DATE == null
select ABCD1.MINORS_DESC).Min();

How to join linq in c#?

How do I convert this SQL to LINQ (System.Linq)?
Select top 100 percent s.a,s.b,s.c,s.d
From table a as s, table b as x
Where
s.a=x.a and s.b=x.b and s.c=x.c
Group by
s.a,s.b,s.c,s.d
As I understand your question, you want to fetch the data in C# and do the join there. If so, you can do the following:
public class tabData
{
public string a {get;set;}
public string b {get;set;}
public string c {get;set;}
public string d {get;set;}
}
List<tabData> tabA = new List<tabData>(); // fill with the rows of table a
List<tabData> tabB = new List<tabData>(); // fill with the rows of table b
var result = from r1 in tabA
join r2 in tabB on new {T1 = r1.a, T2 = r1.b, T3 = r1.c} equals new {T1 = r2.a, T2 = r2.b, T3 = r2.c}
group r1 by new
{
aa = r1.a,
bb = r1.b,
cc = r1.c,
dd = r1.d
} into g
select new
{
a = g.Key.aa,
b = g.Key.bb,
c = g.Key.cc,
d = g.Key.dd
};
I think you are asking how to join in LINQ as you would in SQL. If so, see below:
var query =
from abc in tbl1
join def in tbl2 on abc.PK equals def.FK
select new { ABC = abc, DEF = def };

Find missing element SQL

I have a couple of tables of a database where one defines a set of matrices and the other the data in the matrices
Matrices
Id Name
1 M1
2 M2
3 M3
4 M4
MatrixElements
Matrix_Id ElementKey Value
1 1 234
1 2 234
1 3 4432
2 1 234
2 2 13
2 3 123
3 1 34
3 3 345
4 1 234
4 2 11
4 3 344
So the Matrix_Id column is a foreign key back to the Id of the Matrices table. The ElementKey represents an ij pair. The matrices are sparse, so there may or may not be an element with a specific key. However, if one matrix has a particular ElementKey, then the ElementKey with that ID must be defined in ALL matrices.
Is there some SQL that I can run that will find Matrix_Id and ElementKey combinations for any offending entries, i.e. one that is not defined for all matrices? So for the example above, ElementKey = 2 is defined for Matrix 1, 2 and 4 but not 3, so I would expect [Matrix = 3, ElementKey = 2] back.
This will get the missing elements and the matrices they are in:
select m.Id, e.ElementKey
from Matrices m cross join
(select distinct ElementKey from MatrixElements) e left join
MatrixElements me
on me.Matrix_Id = m.Id and me.ElementKey = e.ElementKey
where me.Matrix_Id is null;
The cross join generates all combinations of matrices with known element keys. The left join and where then find the ones that are missing.
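As a check against the sample data in the question, the cross join and left join can be run in SQLite from Python. This is a sketch under a few assumptions: the table and column names follow the question (Matrices, MatrixElements, Matrix_Id, ElementKey), and the element key is read from the derived table `e`, because the columns of `me` are NULL on exactly the rows the query returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Matrices (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE MatrixElements (Matrix_Id INTEGER, ElementKey INTEGER, Value INTEGER)"
)
conn.executemany(
    "INSERT INTO Matrices VALUES (?, ?)",
    [(1, "M1"), (2, "M2"), (3, "M3"), (4, "M4")],
)
conn.executemany(
    "INSERT INTO MatrixElements VALUES (?, ?, ?)",
    [(1, 1, 234), (1, 2, 234), (1, 3, 4432),
     (2, 1, 234), (2, 2, 13), (2, 3, 123),
     (3, 1, 34), (3, 3, 345),
     (4, 1, 234), (4, 2, 11), (4, 3, 344)],
)

# Every (matrix, known key) pair, minus the pairs that actually exist.
missing = conn.execute("""
    SELECT m.Id, e.ElementKey
    FROM Matrices m
    CROSS JOIN (SELECT DISTINCT ElementKey FROM MatrixElements) e
    LEFT JOIN MatrixElements me
           ON me.Matrix_Id = m.Id AND me.ElementKey = e.ElementKey
    WHERE me.Matrix_Id IS NULL
""").fetchall()
print(missing)  # [(3, 2)]
```

The single row returned matches the expectation stated in the question: Matrix 3 lacks ElementKey 2.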
First we make a list of all live elements, then cross join it to all active matrices. We then left join the actual elements onto this list and use a CASE expression to flag whether each one exists.
I've used plain ANSI SQL, but a CTE would be better on SQL Server or Oracle.
select c.id, c.ElementKey, case when b.Matrix_Id is null then 0 else 1 end as InPlace
from
(
select id, a.ElementKey
from Matrices
cross join
(
select distinct ElementKey
from MatrixElements
) a
) c
left join MatrixElements b
on b.Matrix_Id = c.id
and b.ElementKey = c.ElementKey
Thanks for the responses. I haven't been able to use them directly because I am using Entity Framework, and it is hard to translate the SQL given into a query that will give me back the results. I did take inspiration from the samples, and this is what I came up with:
public class Matrix
{
[Key]
public int Id { get; set; }
public virtual List<MatrixElement> Data { get; set; }
}
public class MatrixElement
{
[Key]
public int Id { get; set; }
public int OdPairKey { get; set; }
public double Value { get; set; }
}
public class EngineDataContext : DbContext
{
public virtual DbSet<Matrix> MatrixFiles { get; set; }
public virtual DbSet<MatrixElement> MatrixData { get; set; }
}
public class SqliteRepository
{
private readonly EngineDataContext _dataContext;
public SqliteRepository(EngineDataContext dataContext)
{
_dataContext = dataContext;
}
public IEnumerable<Tuple<Matrix, int>> FindMissingODPairs(IEnumerable<Matrix> matrices)
{
IEnumerable<Matrix> matricesWithData =
matrices.Select(m => _dataContext.MatrixFiles
.Include("Data").First(mf => mf.Id == m.Id));
// Do the cross join on matrices and OD pairs
IEnumerable<dynamic> combinations =
from m in matrices
from od in matricesWithData.SelectMany(mat => mat.Data.Select(md => md.OdPairKey)).Distinct()
select new { M = m.Id, O = od };
// Find all the used matrix / OD pair combinations
IEnumerable<dynamic> used =
from m in matricesWithData
from od in m.Data
select new { M = m.Id, O = od.OdPairKey };
// Find the missing combinations with a simple "Except" query
return combinations
.Except(used)
.Select(c => new Tuple<Matrix, int>(matrices.First(m => m.Id == c.M), c.O));
}
}

Empty result set when Joining two table with Non-match Foreign key

I am joining two tables using a foreign key. TABLE_1 might have a row with a null for the foreign key, which means that when I join the two tables on that foreign key I won't get results for that row. My problem is that when I join the two tables in LINQ, I get an empty result set.
I want to be able to get the row in TABLE_1 even if the join finds no match in TABLE_2.
I tried to use DefaultIfEmpty in the join of TABLE_2, but I still get an empty result set. How can I join two tables and still get a result even if TABLE_1 contains a null in the foreign key that I use for the join?
Thanks
Try a left join from Table2 to Table1:
class Program
{
static void Main(string[] args)
{
List<Table1> Table_1 = new List<Table1>();
Table_1.Add(new Table1() { Id = 1, Name = "Lion" });
Table_1.Add(new Table1() { Id = 2, Name = "Elephant" });
List<Table2> Table_2 = new List<Table2>();
Table_2.Add(new Table2() { Id = 1, Class = "Carnivorous" });
Table_2.Add(new Table2() { Id = 2, Class = "Herbivorous" });
Table_2.Add(new Table2() { Id = 3, Class = "Mammal" });
Table_2.Add(new Table2() { Id = 4, Class = "Aquarious" });
var result = (from a in Table_2
join b in Table_1
on a.Id equals b.Id into leftJoin
from c in leftJoin.DefaultIfEmpty()
select new { Id = a.Id, Name = c == null ? string.Empty : c.Name, Class = a.Class }
).ToList();
var abc = result;
}
}
public class Table1
{
public int Id;
public string Name;
}
public class Table2
{
public int Id;
public string Class;
}
Try RIGHT JOIN if LEFT JOIN doesn't work.
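For reference, this is the SQL behaviour the question is after, sketched in Python with the standard `sqlite3` module. The schema is an assumption made for illustration (a hypothetical `table_2_id` foreign key column on `table_1`); the point is only that a LEFT JOIN starting from the table whose rows must survive keeps the row with the null foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_2 (id INTEGER PRIMARY KEY, class TEXT)")
conn.execute(
    "CREATE TABLE table_1 (id INTEGER PRIMARY KEY, name TEXT, table_2_id INTEGER)"
)
conn.execute("INSERT INTO table_2 VALUES (1, 'Carnivorous')")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [(1, "Lion", 1), (2, "Elephant", None)],  # Elephant has a null foreign key
)

# LEFT JOIN keeps every table_1 row; unmatched rows get NULL for table_2 columns.
joined = conn.execute("""
    SELECT t1.name, t2.class
    FROM table_1 t1
    LEFT JOIN table_2 t2 ON t2.id = t1.table_2_id
    ORDER BY t1.id
""").fetchall()
print(joined)  # [('Lion', 'Carnivorous'), ('Elephant', None)]
```

An inner join over the same data would return only the Lion row, which is the kind of missing result described in the question.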

Need Linq equivalent

Please help me find the LINQ equivalent of the SQL query below:
select sum(weight) from (
select weight from pers p
join list l on l.persindex= p.persindex
group by p.persindex,weight) a
from p in context.pers
join l in context.list on p.persindex equals l.persindex
group l by new
{ p.persindex,
l.weight
} into myGroup
select new
{ Key = myGroup.Key,
GroupSum = myGroup.Sum(x => x.weight)
}
I guess that's what you need:
public int CalcWeight(IEnumerable<Person> pers, IEnumerable<Person> list)
{
return
pers
.Join(list, p=>p.PersIndex, l=>l.PersIndex, (p, l) => new {p.PersIndex, l.Weight})
.GroupBy( a => new {a.PersIndex, a.Weight})
.Sum(group=>group.Key.Weight);
}
The data class Person is declared like this:
public class Person
{
public int PersIndex;
public int Weight;
}
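To make the intent of the original SQL concrete, here is a sketch that runs it in SQLite from Python and compares it with the same computation in plain Python. The sample rows are invented for illustration, and it assumes that `weight` is a column of `pers` (the question does not say which table it belongs to).

```python
import sqlite3

pers_rows = [(1, 10), (2, 20), (3, 30)]  # (persindex, weight), made-up data
list_rows = [(1,), (1,), (2,)]           # person 1 is listed twice, 3 not at all

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pers (persindex INTEGER, weight INTEGER)")
conn.execute("CREATE TABLE list (persindex INTEGER)")
conn.executemany("INSERT INTO pers VALUES (?, ?)", pers_rows)
conn.executemany("INSERT INTO list VALUES (?)", list_rows)

# The inner GROUP BY collapses duplicate (persindex, weight) pairs produced
# by the join, so each joined person's weight is summed only once.
total = conn.execute("""
    SELECT SUM(weight) FROM (
        SELECT weight FROM pers p
        JOIN list l ON l.persindex = p.persindex
        GROUP BY p.persindex, weight
    ) a
""").fetchone()[0]

# Same logic without SQL: join, de-duplicate the pairs with a set, then sum.
pairs = {(p, w) for p, w in pers_rows for (l,) in list_rows if l == p}
total_py = sum(w for _, w in pairs)

print(total, total_py)  # 30 30
```

Person 1 appears twice in the join but contributes 10 only once, and person 3 has no match, so the total is 10 + 20.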
VB.NET version
Dim customerList = From p In pers
Group Join l In list On
p.persindex Equals l.persindex
Into joineddata = Group,
TotalOweight = Sum(p.weight)
Select p.persindex, TotalOweight
Try Linquer, an SQL to LINQ conversion tool:
from p in pers
join l in list on p.persindex equals l.persindex
group l by new { p.persindex, l.weight } into grp
select new { sum = grp.Sum(x => x.weight) }