SQLite: remove rows where two columns are cross-equal - sql

I'm developing an Android chat app. I'm using Room (the SQLite persistence library) to store data.
I have two tables: Friends and Messages. I want to display the latest message for every conversation.
Friends table:
data class FriendsModel(
    @PrimaryKey
    @SerialName("friendID") var friendID: String,
    @SerialName("friendName") var friendName: String?,
    @SerialName("friendPublicKey") var friendPublicKey: String?,
    @SerialName("sharedSecretKey") var sharedSecretKey: String?
)
Messages table:
data class MessageModel(
    @PrimaryKey(autoGenerate = true)
    @SerialName("id") var _id: Int = 0,
    @SerialName("from") var from: String,
    @SerialName("to") var to: String,
    @SerialName("message") var message: String?,
    @SerialName("date") var date: String?
)
Fake data to give an example:
Notice that User1 sends 3 messages to User2, and User2 sends 3 messages to User1. I want to get the latest message of every conversation. It doesn't matter who sent the last message (the 'from' or 'to' fields).
The result should be something like this:
I tried sorting by date and grouping by both from and to, but that gives me the last message sent by User1 to User2 and the last message sent by User2 to User1 at the same time. I want to eliminate this duplication.
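One possible approach, sketched here with Python's built-in sqlite3 module rather than Room so that it runs standalone: normalize each (from, to) pair so that both directions of a conversation land in the same group, then keep the row with the greatest date. The table name messages, the sample rows, and the ISO-8601 date strings are assumptions made for illustration; they are not taken from the question.

```python
import sqlite3

# Assumed schema mirroring MessageModel; the table name "messages" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE messages ('
    '_id INTEGER PRIMARY KEY AUTOINCREMENT, '
    '"from" TEXT, "to" TEXT, message TEXT, date TEXT)'
)

# Made-up sample rows. Dates are assumed to be ISO-8601 strings, so that
# comparing them as text matches chronological order.
rows = [
    ("User1", "User2", "hi", "2021-01-01T10:00:00"),
    ("User2", "User1", "hello", "2021-01-01T10:01:00"),
    ("User1", "User2", "how are you?", "2021-01-01T10:02:00"),
    ("User2", "User1", "fine, thanks", "2021-01-01T10:03:00"),
    ("User1", "User3", "hey", "2021-01-01T09:00:00"),
]
conn.executemany(
    'INSERT INTO messages ("from", "to", message, date) VALUES (?, ?, ?, ?)',
    rows,
)

# Two-argument min()/max() are scalar functions in SQLite, so the GROUP BY key
# is identical for (User1, User2) and (User2, User1). Because the query has a
# single MAX() aggregate, SQLite fills the other columns from the row that
# holds the maximum date.
LATEST_PER_CONVERSATION = """
    SELECT "from", "to", message, MAX(date) AS last_date
    FROM messages
    GROUP BY min("from", "to"), max("from", "to")
    ORDER BY last_date DESC
"""

latest = conn.execute(LATEST_PER_CONVERSATION).fetchall()
for row in latest:
    print(row)
```

The SELECT statement itself is plain SQLite, so it should be usable from a Room query as well, although that has not been tried here.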

Group by the day of the month - Kotlin Logic Problem

I'm stuck with the logic. So here it is, I have one model class Note:
data class Note(
    val id: Int,
    val title: String,
    val description: String,
    val date: Long = System.currentTimeMillis()
)
I have a list of multiple notes in my app, List<Note>, and I need a way to convert that list into a Map where the key is the date (Long) and the value is a List<Note>, that is, Map<Long, List<Note>>. I need to group those notes by the day of the month. For example, if multiple notes were created on October 31st, they should be grouped in a single list of notes within the Map.
I'm really not sure how I can achieve that. I have always had trouble with date values. I will appreciate any help. :)
You can add a helper property that returns the date as a LocalDate, which makes it easy to group by day. If you use it a lot, you might consider making it a member property that isn't recomputed on each retrieval (but not in the constructor, because it is computed from another property that participates in equals and hashCode).
val Note.localDate: LocalDate
    get() = Instant.ofEpochMilli(date).atZone(ZoneId.systemDefault()).toLocalDate()
Then you can use groupBy to create your Map of dates to lists.
val notesByLocalDate = notes.groupBy(Note::localDate) // or { it.localDate }
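For readers who want to see the idea outside Kotlin, here is a minimal Python sketch of the same technique: convert each epoch-millisecond timestamp to a calendar date, then group on that date. The helper names local_date and group_by_day, the sample notes, and the fixed UTC zone (instead of the system default) are assumptions chosen so the result is reproducible.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Note:
    id: int
    title: str
    description: str
    date: int  # epoch milliseconds, as in the Kotlin class


def local_date(note, tz=timezone.utc):
    """Calendar date of the note in the given time zone (UTC is an assumption)."""
    return datetime.fromtimestamp(note.date / 1000, tz=tz).date()


def group_by_day(notes, tz=timezone.utc):
    """Map each calendar date to the notes created on that date."""
    grouped = defaultdict(list)
    for note in notes:
        grouped[local_date(note, tz)].append(note)
    return dict(grouped)


# Made-up sample data.
notes = [
    Note(1, "a", "first", 1635638400000),   # 2021-10-31T00:00:00Z
    Note(2, "b", "second", 1635681600000),  # 2021-10-31T12:00:00Z
    Note(3, "c", "third", 1635724800000),   # 2021-11-01T00:00:00Z
]
by_day = group_by_day(notes)
for day, day_notes in by_day.items():
    print(day, [n.id for n in day_notes])
```

Keying the map by a date value rather than a raw Long keeps the grouping independent of the time of day at which a note was created.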
This is going to be one of those "just because you can, doesn't mean you should" cases.
val notesByDay = notes.groupBy {
    // Read the day, month and year from the note's timestamp
    val noteCalendar = Calendar.getInstance()
    noteCalendar.timeInMillis = it.date
    val day = noteCalendar.get(Calendar.DAY_OF_MONTH)
    val month = noteCalendar.get(Calendar.MONTH)
    val year = noteCalendar.get(Calendar.YEAR)
    // Build a key that is identical for every note created on that day
    val dayCalendar = Calendar.getInstance()
    dayCalendar.timeInMillis = 0L
    dayCalendar.set(Calendar.DAY_OF_MONTH, day)
    dayCalendar.set(Calendar.MONTH, month)
    dayCalendar.set(Calendar.YEAR, year)
    dayCalendar.set(Calendar.HOUR_OF_DAY, 12)
    dayCalendar.timeInMillis
}
Trying to group notes by their creation time in milliseconds will result in no grouping, because practically nothing is created at exactly the same millisecond. The only way to group them is to collapse every timestamp that falls on the same day into a single value; in this case, noon of that day, to avoid timezone problems.
...but again, I would recommend not grouping this by Long.

SFDC - Query all contacts shared with a given user?

I'm somewhat of an SFDC novice when it comes to integration, but is there any way I can query just the contacts shared with a given user, taking into account all the ways the share can occur? Essentially, I want to see the same contacts the user would see within the platform.
I think this is what you are looking for. I added some inline comments to explain what each step is doing. The end result should be all the contacts that can be read by a specified user in your org.
// build a set with all the contact ids in your org
List<contact> contacts = new List<contact>([Select id from Contact]);
Set<ID> contactids = new Set<ID>();
for(Contact c : contacts)
    contactids.add(c.id);
// using UserRecordAccess you can query all the record ids and the level of access for a specified user
List<UserRecordAccess> ura = new List<UserRecordAccess>([
    SELECT RecordId, HasReadAccess, HasTransferAccess, MaxAccessLevel
    FROM UserRecordAccess
    WHERE UserId = 'theuserid'
    AND RecordId in: contactids
]);
// unfortunately you cannot aggregate your query on HasReadAccess=true, so you need to add this step
Set<id> readaccessID = new Set<ID>();
for(UserRecordAccess ur : ura)
{
    if(ur.HasReadAccess==true)
    {
        readaccessID.add(ur.RecordID);
    }
}
// this is the list of all the contacts that can be read by the specified user
List<Contact> readAccessContact = new List<Contact>([Select id, name from contact where id in: readaccessID]);
// show the results
system.debug(readAccessContact);

PIG FILTER relation with next row the same relation

I have been searching for a long time for a solution to my problem but have found almost nothing helpful.
Hopefully some of you can give me a tip.
I have a relation A with the following format: username, timestamp, ip
For example:
Harald 2014-02-18T16:14:49.503Z 123.123.123.123
Harald 2014-02-18T16:14:51.503Z 123.123.123.123
Harald 2014-02-18T16:14:55.503Z 321.321.321.321
I want to find out who changed their IP address in less than 5 seconds, so the second and third rows are the interesting ones.
I want to group the relation by username and compare the timestamp of the current row with that of the next row. If the IP address isn't the same and the timestamp is less than 5 seconds later, the row should appear in the output.
Could someone help me with this issue?
Regards.
First, I want to thank you for your time.
But I'm actually stuck at the Sessionize part.
This is my incoming data:
aoebcu 2014-02-19T14:23:17.503Z 220.61.65.25
aoebcu 2014-02-19T14:23:14.503Z 222.117.144.19
aoebcu 2014-02-19T14:23:14.503Z 222.117.144.19
jekgru 2014-02-19T14:23:14.503Z 213.56.157.109
zmembx 2014-02-19T14:23:12.503Z 199.188.198.91
qhixcg 2014-02-19T14:23:11.503Z 203.40.104.119
and my code so far looks like this:
hijack_Reduced = FOREACH finalLogs GENERATE ClientUserName, timestamp, OriginalClientIP;
hijack_Filtered = FILTER hijack_Reduced BY OriginalClientIP != '-';
hijack_Sessionized = FOREACH (GROUP hijack_Filtered BY ClientUserName) {
    views = ORDER hijack_Filtered BY timestamp;
    GENERATE FLATTEN(Sessionize(views)) AS (ClientUserName,timestamp,OriginalClientIP,session_id);
}
But when I run this script, I get the following error message:
15:36:22 ERROR -
org.apache.pig.tools.pigstats.SimplePigStats.setBackendException(542)
| ERROR 0: Exception while executing [POUserFunc (Name:
POUserFunc(datafu.pig.sessions.Sessionize)[bag] - scope-199 Operator
Key: scope-199) children: null at []]:
java.lang.IllegalArgumentException: Invalid format: "aoebcu"
I have already tried a lot, but nothing worked.
Do you have an idea?
Regards
While you could write a UDF for this, you can actually make use of the UDFs already available in Apache DataFu to solve this.
My solution involves applying sessionization to the data. Basically you look at consecutive events and assign each event a session ID. If the time elapsed between two events exceeds a specified amount of time, in your case 5 seconds, then the next event gets a new session ID. Otherwise consecutive events get the same session ID. Once each event is assigned its session ID the rest is easy. We group by session ID and look for sessions that have more than one distinct IP address.
I'll walk through my solution.
Suppose you have the following input data. Both Harold and Kumar change their IP addresses, but Harold does it within 5 seconds, while Kumar does not. So the output of our script should simply be "Harold".
Harold,2014-02-18T16:14:49.503Z,123.123.123.123
Harold,2014-02-18T16:14:51.503Z,123.123.123.123
Harold,2014-02-18T16:14:55.503Z,321.321.321.321
Kumar,2014-02-18T16:14:49.503Z,123.123.123.123
Kumar,2014-02-18T16:14:55.503Z,123.123.123.123
Kumar,2014-02-18T16:15:05.503Z,321.321.321.321
Load the data:
data = LOAD 'input' using PigStorage(',')
    AS (user:chararray,time:chararray,ip:chararray);
Now define a couple UDFs from DataFu. The Sessionize UDF performs sessionization as I described earlier. The DistinctBy UDF will be used to find the distinct IP addresses within each session.
define Sessionize datafu.pig.sessions.Sessionize('5s');
define DistinctBy datafu.pig.bags.DistinctBy('1');
Group the data by user, sort by time, and apply the Sessionize UDF. Note that the timestamp must be the first field, as this is what Sessionize expects. This UDF appends a session ID to each tuple.
data = FOREACH data GENERATE time,user,ip;
data_sessionized = FOREACH (GROUP data BY user) {
    views = ORDER data BY time;
    GENERATE flatten(Sessionize(views)) as (time,user,ip,session_id);
}
Now that the data is sessionized, we can group by the user and session. I group by user too because I want to spit this value back out. We pass the bag of events into the DistinctBy UDF. Check the documentation of this UDF for a more detailed description. But essentially we will get as many tuples as there are distinct IP addresses per session. Note that I have removed the time from the relation below. This is because 1) it isn't needed, and 2) the DistinctBy in 1.2.0 of DataFu has a bug when handling fields containing dashes, as the time field does.
data_sessionized = FOREACH data_sessionized GENERATE user,ip,session_id;
data_sessionized = FOREACH (GROUP data_sessionized BY (user, session_id)) GENERATE
    group.user as user,
    SIZE(DistinctBy(data_sessionized)) as distinctIpCount;
Now select all the sessions that had more than one distinct IP address and return the distinct users for these sessions.
data_sessionized = FILTER data_sessionized BY distinctIpCount > 1;
data_sessionized = FOREACH data_sessionized GENERATE user;
data_sessionized = DISTINCT data_sessionized;
This produces simply:
Harold
Here is the full source code, which you should be able to paste directly into the DataFu unit tests and run:
/**
define Sessionize datafu.pig.sessions.Sessionize('5s');
define DistinctBy datafu.pig.bags.DistinctBy('1'); -- distinct by ip

data = LOAD 'input' using PigStorage(',') AS (user:chararray,time:chararray,ip:chararray);
data = FOREACH data GENERATE time,user,ip;

data_sessionized = FOREACH (GROUP data BY user) {
    views = ORDER data BY time;
    GENERATE flatten(Sessionize(views)) as (time,user,ip,session_id);
}

data_sessionized = FOREACH data_sessionized GENERATE user,ip,session_id;
data_sessionized = FOREACH (GROUP data_sessionized BY (user, session_id)) GENERATE
    group.user as user,
    SIZE(DistinctBy(data_sessionized)) as distinctIpCount;

data_sessionized = FILTER data_sessionized BY distinctIpCount > 1;
data_sessionized = FOREACH data_sessionized GENERATE user;
data_sessionized = DISTINCT data_sessionized;

STORE data_sessionized INTO 'output';
*/
@Multiline private String sessionizeUserIpTest;

private String[] sessionizeUserIpTestData = new String[] {
    "Harold,2014-02-18T16:14:49.503Z,123.123.123.123",
    "Harold,2014-02-18T16:14:51.503Z,123.123.123.123",
    "Harold,2014-02-18T16:14:55.503Z,321.321.321.321",
    "Kumar,2014-02-18T16:14:49.503Z,123.123.123.123",
    "Kumar,2014-02-18T16:14:55.503Z,123.123.123.123",
    "Kumar,2014-02-18T16:15:05.503Z,321.321.321.321"
};

@Test
public void sessionizeUserIpTest() throws Exception
{
    PigTest test = createPigTestFromString(sessionizeUserIpTest);
    this.writeLinesToFile("input",
                          sessionizeUserIpTestData);
    List<Tuple> result = this.getLinesForAlias(test, "data_sessionized");
    assertEquals(result.size(),1);
    assertEquals(result.get(0).get(0),"Harold");
}
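To make the sessionization logic easier to follow outside Pig, here is a rough Python sketch of the same idea: order each user's events by time, start a new session whenever the gap between consecutive events exceeds 5 seconds, and flag users whose session contains more than one distinct IP. The function name users_with_fast_ip_change is hypothetical, and this is only an approximation of what the DataFu UDFs do, not their implementation.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def users_with_fast_ip_change(events, max_gap=timedelta(seconds=5)):
    """Return the users that used more than one IP within a single session."""
    by_user = {}
    for user, timestamp, ip in events:
        parsed = datetime.strptime(timestamp, TIME_FORMAT)
        by_user.setdefault(user, []).append((parsed, ip))

    flagged = set()
    for user, rows in by_user.items():
        rows.sort()  # order each user's events by time
        session_ips = set()
        previous_time = None
        for time, ip in rows:
            # A gap longer than max_gap starts a new session.
            if previous_time is not None and time - previous_time > max_gap:
                session_ips = set()
            session_ips.add(ip)
            if len(session_ips) > 1:
                flagged.add(user)
            previous_time = time
    return sorted(flagged)


# Same sample data as in the answer above.
events = [
    ("Harold", "2014-02-18T16:14:49.503Z", "123.123.123.123"),
    ("Harold", "2014-02-18T16:14:51.503Z", "123.123.123.123"),
    ("Harold", "2014-02-18T16:14:55.503Z", "321.321.321.321"),
    ("Kumar", "2014-02-18T16:14:49.503Z", "123.123.123.123"),
    ("Kumar", "2014-02-18T16:14:55.503Z", "123.123.123.123"),
    ("Kumar", "2014-02-18T16:15:05.503Z", "321.321.321.321"),
]
suspicious = users_with_fast_ip_change(events)
print(suspicious)  # ['Harold']
```

Kumar is not flagged because each of his events is more than 5 seconds after the previous one, so every event ends up in its own session.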

Difference between storing data as a key and as property of a hash object

Right now, I'm storing user objects as follows:
user1 = { id: 1, name: "bob" }
user2 = { id: 2, name: "steve" }
HMSET "user:1", user1
HMSET "user:2", user2
HGETALL "user:1" would return the user1 object
HGETALL "user:2" would return the user2 object
I'm wondering if there would be any significant difference (performance or other) if I did:
user1 = { id: 1, name: "bob" }
user2 = { id: 2, name: "steve" }
HSET "USER", 1, JSON.stringify(user1)
HSET "USER", 2, JSON.stringify(user2)
HGET "USER", 1 would give me the string representation of the user1 object
HGET "USER", 2 would give me the string representation of the user2 object
There's not a huge difference either way. It's mostly going to boil down to a design decision based on what you're doing, although whichever you use you should stay consistent throughout the project to avoid confusion.
Here are some pros of method 2:
- Using JSON could help maintain type consistency.
- Redis will use less memory and may be a tiny bit faster, since it doesn't have to store or look up those extra keys.
- It might be easier to think about and work with in code.
The main negative for method 2 is summed up in the following example. Say you need to update a user's name. Here's how you would do it with each method.
// Method 1:
HMSET "user:1" name newname
// Method 2:
result = JSON.parse(HGET "USER" 1)
result.name = newname
HSET "USER" 1 JSON.stringify(result)
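The difference between the two update paths can be sketched in Python as follows. Plain dictionaries stand in for the Redis server here, so this only illustrates the data shapes and the number of steps involved; it is not a Redis client, and the helper names rename_method1 and rename_method2 are hypothetical.

```python
import json

# In-memory stand-in for Redis: top-level keys map to hashes (dicts).
store = {}

# Method 1: one hash per user, one field per attribute.
store["user:1"] = {"id": "1", "name": "bob"}

# Method 2: a single "USER" hash; fields are user ids, values are JSON strings.
store["USER"] = {"1": json.dumps({"id": 1, "name": "bob"})}


def rename_method1(user_id, new_name):
    # Corresponds to a single field write on the user's own hash.
    store[f"user:{user_id}"]["name"] = new_name


def rename_method2(user_id, new_name):
    # Corresponds to HGET, then parse, modify, serialize, and HSET.
    user = json.loads(store["USER"][str(user_id)])
    user["name"] = new_name
    store["USER"][str(user_id)] = json.dumps(user)


rename_method1(1, "robert")
rename_method2(1, "robert")
print(store["user:1"])
print(json.loads(store["USER"]["1"]))
```

Against a real server, the read and the write in method 2 are separate commands, so two clients updating the same user at the same time could overwrite each other's changes unless the update is guarded in some way.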

Convert multi-rows values into Collection(List) in LINQ

I am struggling to convert values from multiple rows that belong to the same user into a collection.
Here is a simple scenario.
users table: userid, password
address table: address, userid
The users table and the address table have a one-to-many relationship: one user might have multiple addresses.
Assume the user's ID is 1001 and he/she has two addresses, one in Auckland and another in Wellington.
I would like to select both of them together with the user's ID.
1001 Auckland
1001 Wellington
So the question is: is there any approach that can put these two values into a collection such as a list?
public class UserDetails {
    private List<String> _Address;
    public string userid { get; set; }
    public List<String> Address {
        get { return _Address; }
        set { _Address = value; }
    }
}
var user_address = from _user in users
                   join _address in address on _user.userid equals _address.userid
                   select new UserDetails {
                       userid = _user.userid,
                       **Address.add()**
                   };
Does anyone know how to construct the List inside the LINQ query and call the Add method?
I want to put the list object into one row to avoid repeating the userid.
Thanks for your help.
Maybe something like this:
var user_address = from _user in users
                   select new UserDetails {
                       userid = _user.userid,
                       Address = (from _address in address
                                  where _user.userid == _address.userid
                                  select _address.address
                                 ).ToList()
                   };
You do not have to join the address table.
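The shape of the result, one entry per user with the addresses collected into a list, can be illustrated with a small Python sketch. The dictionaries below are made-up stand-ins for the users and address tables, and user_details is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up rows standing in for the two tables.
users = [{"userid": 1001, "password": "secret"}]
addresses = [
    {"address": "Auckland", "userid": 1001},
    {"address": "Wellington", "userid": 1001},
]


def user_details(users, addresses):
    """Return one record per user with all of that user's addresses in a list."""
    by_user = defaultdict(list)
    for row in addresses:
        by_user[row["userid"]].append(row["address"])
    return [
        {"userid": user["userid"], "Address": by_user.get(user["userid"], [])}
        for user in users
    ]


details = user_details(users, addresses)
print(details)
```

A user without any address rows ends up with an empty list here, which matches what the nested query with ToList() would produce.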