Find missing values between two lists of IDs using SQL - sql

I have a large list of IDs that received a certain status (e.g. 'EMAIL SENT'). Each of these should be accompanied by another row with a second status (e.g. 'EMAIL OPENED').
I have both lists of IDs: all of those with EMAIL SENT and all of those that also received EMAIL OPENED. What is the best way, using an SQL query, to find the IDs that have EMAIL SENT but have not received EMAIL OPENED?
The IDs I am searching on are not primary keys; the same ID can appear in multiple rows of this table, each with its own status.
I had thought of using EXISTS, but that just gives a true/false result. The desired result is a list of the IDs that are found in the larger list of EMAIL SENT but not in the smaller list of EMAIL OPENED.
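One possible approach, sketched here with Python's built-in sqlite3 module so it can be run as-is: NOT EXISTS used inside a WHERE clause filters rows instead of returning a flag. The table name email_status and its columns are assumptions, since the question does not show the real schema.

```python
import sqlite3

# Hypothetical table and column names; the question does not give the schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_status (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO email_status (id, status) VALUES (?, ?)",
    [
        (1, "EMAIL SENT"), (1, "EMAIL OPENED"),
        (2, "EMAIL SENT"),
        (3, "EMAIL SENT"), (3, "EMAIL OPENED"),
        (4, "EMAIL SENT"), (4, "EMAIL SENT"),  # id is not a unique key
    ],
)

# Keep each sent id for which no 'EMAIL OPENED' row exists.
# DISTINCT collapses ids that were sent more than once.
missing_opened = [
    row[0]
    for row in conn.execute(
        """
        SELECT DISTINCT s.id
        FROM email_status AS s
        WHERE s.status = 'EMAIL SENT'
          AND NOT EXISTS (
              SELECT 1
              FROM email_status AS o
              WHERE o.id = s.id
                AND o.status = 'EMAIL OPENED'
          )
        ORDER BY s.id
        """
    )
]
print(missing_opened)
```

If the two lists live in separate tables, the same pattern should apply with the outer query reading from the sent list and the subquery reading from the opened list.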

Related

How to use loop to find related object using Pentaho Data Integration

I want to identify the bad/invalid records so that I can add them to a separate SQL table. For example, we have an account object, and I want to find bad accounts, but I need to apply some filters on the contact object. If the conditions on the contacts are satisfied, I want to insert those invalid account records into the SQL table.
I don't want to query contact directly. I want to query account, but the conditions should come from contact.
Does anyone know the best way to perform a loop in Pentaho? Check each contact record; if all of an account's contacts satisfy the condition, add the account ID to the table. If any one contact record doesn't satisfy the condition, the account should not be added to the SQL table.
For example:
Account "A" has 10 contacts.
If the email field is empty on all 10 contacts, add the account to the SQL table (as bad data).
If two of the contact records have the email field populated and 8 are blank, the account ID shouldn't be added to the SQL table.
How can we best implement this scenario using Pentaho? Any help is appreciated.
Thanks
You can create a transformation along these lines:
Query the accounts together with their contacts.
Order the query data by account.
Group the information by account and calculate the maximum ContactMail (if all the contacts' emails are null, the max will be null; the result of that step is shown in the Preview data part of my screenshot).
Filter rows on MaxContactMail: a null value means every contact email on the account is empty (a bad account), while a non-null value means at least one contact has an email.
These are the basic steps; you'll need to add more steps or perform more than one transformation depending on the complexity of your data.
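The group-and-filter logic of those steps can be sketched in plain SQL, run here through Python's sqlite3 module rather than Pentaho. The contact table, its columns, and the sample rows are assumptions for illustration. The NULLIF call is an extra assumption that blank emails may be stored as empty strings as well as nulls.

```python
import sqlite3

# Hypothetical contact table: one row per contact, keyed by its account.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (account_id TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO contact (account_id, email) VALUES (?, ?)",
    [
        ("A", None), ("A", None), ("A", None),           # all emails missing
        ("B", None), ("B", "b1@example.com"), ("B", ""),  # one email present
        ("C", ""), ("C", None),                           # blank or missing only
    ],
)

# Group by account and take the maximum email. MAX ignores NULLs, so it is
# NULL only when no contact on the account has a usable email.
# NULLIF(email, '') turns empty strings into NULL before aggregating.
bad_accounts = [
    row[0]
    for row in conn.execute(
        """
        SELECT account_id
        FROM contact
        GROUP BY account_id
        HAVING MAX(NULLIF(email, '')) IS NULL
        ORDER BY account_id
        """
    )
]
print(bad_accounts)
```

In a Pentaho transformation the same idea would map onto a sort, a group-by with a maximum aggregate, and a filter step, as the answer describes.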

How can I exclude email addresses from a data extension that are also part of a different data extension in Salesforce Marketing Cloud (using SQL)?

I have a data extension (DE) with email addresses. (A data extension is basically a database table.) However, I want to take out the email addresses that also show up in a different DE, so basically exclude/suppress them. The resulting email addresses plus other data will have to be stored in a third DE.
I know I can use suppression lists [1] when sending an email, but I need to know how many recipients I will have before I actually click the send button. I'm looking for a SQL solution.
[1] = https://help.salesforce.com/articleView?id=mc_es_suppression_lists_in_your_send.htm&type=5
I found the answer:
So let's say your table with email addresses is called SourceDE and the table which has the email addresses to be excluded is called ExcludeDE with a field email, then you can do something like:
/* get the fields */
SELECT
    SourceDE.email,
    SourceDE.firstName
FROM SourceDE
/* join the two tables, but then only keep the rows
   where there is no matching email address from the ExcludeDE */
LEFT JOIN ExcludeDE
    ON SourceDE.email = ExcludeDE.email
WHERE
    /* NULL must be tested with IS NULL; "= NULL" never matches */
    ExcludeDE.email IS NULL
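To see why the filter has to be written with IS NULL, here is a small sketch using Python's sqlite3 module as a stand-in for Marketing Cloud (an assumption; the sample rows are made up). It runs the same anti-join twice, once with IS NULL and once with = NULL.

```python
import sqlite3

# SQLite stands in for the Marketing Cloud query engine in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SourceDE (email TEXT, firstName TEXT);
    CREATE TABLE ExcludeDE (email TEXT);
    INSERT INTO SourceDE VALUES
        ('ann@example.com', 'Ann'),
        ('bob@example.com', 'Bob'),
        ('cy@example.com', 'Cy');
    INSERT INTO ExcludeDE VALUES ('bob@example.com');
    """
)

# The same LEFT JOIN anti-join, with the NULL test left as a placeholder.
anti_join = """
    SELECT SourceDE.email, SourceDE.firstName
    FROM SourceDE
    LEFT JOIN ExcludeDE
        ON SourceDE.email = ExcludeDE.email
    WHERE ExcludeDE.email {predicate}
    ORDER BY SourceDE.email
"""

# IS NULL keeps the source rows that found no match in ExcludeDE.
kept = conn.execute(anti_join.format(predicate="IS NULL")).fetchall()

# "= NULL" evaluates to NULL (unknown) for every row, so nothing is returned.
with_equals_null = conn.execute(anti_join.format(predicate="= NULL")).fetchall()

print(kept)
print(with_equals_null)
```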

How to know how many unread group messages I have

I'm designing a database. This is what I have to represent:
A user can have 0..n friends.
A user can send 0..n messages to a friend.
A user can be member of 0..n groups.
A group can have 1..n members.
A user can send 0..n messages to a group.
To manage conversations between users and groups, I have a table (Talk) with these columns:
TalkId (NOT NULL, PK)
Type (NOT NULL, values: UserTalk or GroupTalk)
StarterUserId (NOT NULL, the user that has started the talk).
RecepientUserId: (NULL, the user that has received the first message. NULL if it is a GroupTalk).
DateStarted: (NOT NULL, when the talk has been started).
GroupId: (NULL, the group that owns the talk. NULL if it is a UserTalk)
I also have a Message table to store all the messages for each Talk. This Message table has a Read column that indicates whether the recipient has read the message.
If user 1 sends a message to user 2, I first check whether there is a Talk row with:
((StarterUserId == 1 and RecepientUserId == 2) OR
(StarterUserId == 2 and RecepientUserId == 1))
If there isn't, I create a new Talk row. Then I insert the message into the Message table, with Message.TalkId pointing to the row I have created.
My problem is that I don't know how to find out how many unread messages a user has for a group talk.
For a user talk it is easy: I check whether the Message.Read column is false.
To know whether a user has unread messages in a group's talk, I could insert the same message once for each group member, changing the recipient. For example:
I have a group with three members. Member 1 sends a message to the group. I have to insert one message for user 2 and the same message for user 3.
But this can make the Message table grow very fast.
I've thought of adding two new columns to the Talk table: the date of the last message sent to that talk, and the ID of the user who sent it. With the date and the sender of the last message in a talk, I can check whether there are new messages, but I can't know how many.
I also have a UserGroup table that stores which users are members of which groups. I could add a new column to this table to store how many unread messages a user has for a group talk. Every time another user sends a message to that group, I would insert a new row into the Message table and increase UserGroup.Unread by one. But I think this would make a mess of the design.
How can I know how many unread messages a user has for a group talk?
You can add a new table MessageStatus with the columns UserID, MessageID and Read where you add one row for each recipient of a message (UserTalk or GroupTalk). This avoids the redundancies you would introduce when duplicating rows in the Message table.
For convenience you could introduce an INSERT-trigger on Message to create the rows in MessageStatus.
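A minimal sketch of that MessageStatus idea, using Python's sqlite3 module. The Message columns MessageId, SenderUserId and Body, the trigger name, and the trimmed-down Talk and UserGroup tables are assumptions for illustration. The trigger here only fans out group talks; a user talk would need a similar rule for its single recipient.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Trimmed-down versions of the tables from the question.
    CREATE TABLE UserGroup (UserId INTEGER, GroupId INTEGER);
    CREATE TABLE Talk (TalkId INTEGER PRIMARY KEY, Type TEXT NOT NULL, GroupId INTEGER);
    CREATE TABLE Message (
        MessageId INTEGER PRIMARY KEY,
        TalkId INTEGER NOT NULL REFERENCES Talk(TalkId),
        SenderUserId INTEGER NOT NULL,
        Body TEXT
    );
    -- One row per recipient per message; the message body is stored only once.
    CREATE TABLE MessageStatus (
        UserID INTEGER NOT NULL,
        MessageID INTEGER NOT NULL REFERENCES Message(MessageId),
        Read INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (UserID, MessageID)
    );
    -- After each new message, add an unread status row for every group
    -- member except the sender.
    CREATE TRIGGER message_fanout AFTER INSERT ON Message
    BEGIN
        INSERT INTO MessageStatus (UserID, MessageID)
        SELECT ug.UserId, NEW.MessageId
        FROM Talk AS t
        JOIN UserGroup AS ug ON ug.GroupId = t.GroupId
        WHERE t.TalkId = NEW.TalkId
          AND ug.UserId <> NEW.SenderUserId;
    END;

    INSERT INTO UserGroup VALUES (1, 10), (2, 10), (3, 10);
    INSERT INTO Talk VALUES (100, 'GroupTalk', 10);
    INSERT INTO Message (TalkId, SenderUserId, Body) VALUES
        (100, 1, 'hello'),
        (100, 1, 'anyone there?');
    """
)

# User 2 reads the first message.
conn.execute("UPDATE MessageStatus SET Read = 1 WHERE UserID = 2 AND MessageID = 1")


def unread_count(user_id, talk_id):
    """Count the unread messages a user has in one talk."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM MessageStatus AS ms
        JOIN Message AS m ON m.MessageId = ms.MessageID
        WHERE ms.UserID = ? AND m.TalkId = ? AND ms.Read = 0
        """,
        (user_id, talk_id),
    ).fetchone()
    return row[0]


print(unread_count(2, 100), unread_count(3, 100), unread_count(1, 100))
```

The status table still grows by one row per recipient, but each row holds only two keys and a flag rather than a copy of the message.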

Get the first message of every conversation

I have one table that stores all the messages sent inside an XMPP service. I'm looking to create a query that gets all the conversations and the first message of each (like WhatsApp does in its chat logs).
Here is my table.
FromPersonId and ToPersonId are ids for people. What I do is, for example I want to see all the conversations of the personId = 643
SELECT DISTINCT MA.FromPersonId, MA.ToPersonId, MAX(MA.SENTDATE) AS [Date], Body
FROM MessageArchive AS MA
WHERE MA.FromPersonId = @personId OR MA.ToPersonId = @personId
GROUP BY MA.FromPersonId, MA.ToPersonId, Body
ORDER BY [Date] DESC
Above is what I have. And the result is:
As you can see, both rows belong to the same conversation, but the query cannot tell that, because the same two people appear in swapped positions.
How can I fix this?
You are missing the 644-to-643 message, supposing it exists. I recommend adding an auto-incrementing row ID; this column tells you exactly which records come first and which come after. Besides, how do you identify that the message is the same?
You are missing a 'conversation' table, with a conversationID field being a foreign key in your MessageArchive table, as a manifestation of the 'one-to-many' relation existing between the conversation entity and the message entity: one conversation holds at least one message, and each message relates to one, and only one, conversation.
With such a database model, you would be able to collect the 'top 1' message of each conversation.
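A rough sketch of that model, using Python's sqlite3 module. It combines the suggested conversation table with the auto-incrementing ID from the other answer. The column names MessageId and ConversationId and all sample rows are assumptions. The query picks the most recent message per conversation, matching the MAX(SENTDATE) in the question; swapping MAX for MIN would return the first one instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Conversation (ConversationId INTEGER PRIMARY KEY);
    CREATE TABLE MessageArchive (
        MessageId INTEGER PRIMARY KEY AUTOINCREMENT,
        ConversationId INTEGER NOT NULL REFERENCES Conversation(ConversationId),
        FromPersonId INTEGER NOT NULL,
        ToPersonId INTEGER NOT NULL,
        SentDate TEXT NOT NULL,
        Body TEXT
    );
    INSERT INTO Conversation VALUES (1), (2);
    -- Illustrative rows: conversation 1 is 643 <-> 644, conversation 2 is 650 <-> 643.
    INSERT INTO MessageArchive
        (ConversationId, FromPersonId, ToPersonId, SentDate, Body)
    VALUES
        (1, 643, 644, '2015-01-01 10:00', 'hi'),
        (1, 644, 643, '2015-01-01 10:05', 'hello back'),
        (2, 650, 643, '2015-01-02 09:00', 'ping'),
        (1, 643, 644, '2015-01-03 08:00', 'still there?');
    """
)

person_id = 643

# One row per conversation: the message with the highest MessageId.
# Grouping by ConversationId makes the sender/recipient order irrelevant.
latest = conn.execute(
    """
    SELECT m.ConversationId, m.FromPersonId, m.ToPersonId, m.Body
    FROM MessageArchive AS m
    WHERE (m.FromPersonId = :pid OR m.ToPersonId = :pid)
      AND m.MessageId = (
          SELECT MAX(x.MessageId)
          FROM MessageArchive AS x
          WHERE x.ConversationId = m.ConversationId
      )
    ORDER BY m.MessageId DESC
    """,
    {"pid": person_id},
).fetchall()
print(latest)
```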

Database Model - SQL - Best Approach

I'm looking for help with part of a database design.
I have to model the database for a group of Contacts and a group of Distribution Lists.
Each Contact can be in many Distribution Lists, and each Distribution List can have many Contacts. Normally, I could use a junction table to model this.
But there's one more thing to add. Contacts have the option to receive notifications via two different methods which are SMS or Email.
A Contact can request to be sent notification via either or both methods.
The piece of the problem that I am stuck with, is that a Contact may wish to receive notifications differently depending on the specific distribution list.
So we have a problem like this:
CONTACT A is in DL-A - Receives notifications via SMS
CONTACT A is in DL-B - Receives notifications via Email & SMS
I'm trying to avoid having more than one entry for a Contact in my Contacts table; each contact should be unique.
Can anyone help?
You could use another junction table:
contactid, distributionlistid, messagepreference
Messagepreference can be email or SMS. Two rows if they want both. New messaging types can be added with no changes to the DB. To be safe, use constants in your code to represent the values you will put into the columns.
Or, add sendemail and sendsms columns to the original junction table, but this has the drawback that you have to change DB structure if you introduce a new messaging type.
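A minimal sketch of the first option, using Python's sqlite3 module. The table name contact_list_preference, the constant values, and the sample IDs are assumptions for illustration.

```python
import sqlite3

# Constants for the values stored in messagepreference, as the answer suggests.
EMAIL = "email"
SMS = "sms"

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contact_list_preference (
        contactid INTEGER NOT NULL,
        distributionlistid INTEGER NOT NULL,
        messagepreference TEXT NOT NULL,
        -- one row per contact, list and delivery method
        PRIMARY KEY (contactid, distributionlistid, messagepreference)
    )
    """
)

CONTACT_A, DL_A, DL_B = 1, 10, 20  # hypothetical ids
conn.executemany(
    "INSERT INTO contact_list_preference VALUES (?, ?, ?)",
    [
        (CONTACT_A, DL_A, SMS),    # DL-A: SMS only
        (CONTACT_A, DL_B, EMAIL),  # DL-B: two rows, one per method
        (CONTACT_A, DL_B, SMS),
    ],
)


def preferences(contact_id, list_id):
    """Return the delivery methods a contact chose for one distribution list."""
    rows = conn.execute(
        """
        SELECT messagepreference
        FROM contact_list_preference
        WHERE contactid = ? AND distributionlistid = ?
        ORDER BY messagepreference
        """,
        (contact_id, list_id),
    )
    return [row[0] for row in rows]


print(preferences(CONTACT_A, DL_A), preferences(CONTACT_A, DL_B))
```

The contact itself still appears only once in the contacts table; only the junction rows multiply.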
You can add two fields to the junction table:
ContactsDistributions(ContactId, DistributionId, SMSFlag, EmailFlag)
in order to specify the type of notification chosen by the contact for each distribution.
You can add one more field to the junction table that represents how the given contact will receive notifications from the given distribution list.
In this case I would add two boolean fields to the junction table, SMS and email, each set to true if and only if the contact wishes to receive notifications in that manner. This allows the notifications to be set differently for each combination of list and contact.
Also, depending on how you want to deal with removal from lists, you could add a constraint on the junction table so that at least one of the two fields is true, which guarantees that a notification is always sent. That said, in Google Groups, for example, I do have access to some lists that I have chosen not to get notifications from.
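The flag-based variant with the "at least one method" rule could look roughly like this, sketched with Python's sqlite3 module. The CHECK constraints and sample IDs are assumptions; booleans are stored as 0/1 integers because SQLite has no separate boolean type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ContactsDistributions (
        ContactId INTEGER NOT NULL,
        DistributionId INTEGER NOT NULL,
        SMSFlag INTEGER NOT NULL DEFAULT 0 CHECK (SMSFlag IN (0, 1)),
        EmailFlag INTEGER NOT NULL DEFAULT 0 CHECK (EmailFlag IN (0, 1)),
        PRIMARY KEY (ContactId, DistributionId),
        -- at least one delivery method must be switched on
        CHECK (SMSFlag = 1 OR EmailFlag = 1)
    )
    """
)

insert = "INSERT INTO ContactsDistributions VALUES (?, ?, ?, ?)"
conn.execute(insert, (1, 10, 1, 0))  # contact 1, list 10: SMS only
conn.execute(insert, (1, 20, 1, 1))  # contact 1, list 20: SMS and email

# A membership row with both flags off violates the table-level CHECK.
try:
    conn.execute(insert, (2, 10, 0, 0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM ContactsDistributions").fetchone()[0]
print(rejected, row_count)
```

Dropping the table-level CHECK would allow the Google Groups style case, where a contact stays on a list but receives no notifications.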