Count of how many times id occurs in table SQL regexp - sql

Hi I have a redshift table of articles that has a field on it that can contain many accounts. So there is a one to many relationship between articles to accounts.
However I want to create a new view where it lists the partner id's in one column and in another column a count of how many times the partner id appears in the articles table.
I've attempted to do this using regex and created a new redshift view, but am getting weird results where it doesn't always build properly. So one day it will say a partner appears 15 times, then the next 17, then the next 15, when the partner id count hasn't actually changed.
Any help would be greatly appreciated.
SELECT partner_id,
COUNT(DISTINCT id)
FROM (SELECT id,
partner_ids,
SPLIT_PART(partner_ids,',',i) partner_id
FROM positron_articles a
LEFT JOIN util.seq_0_to_500 s
ON s.i < regexp_count (partner_ids,',') + 2
OR s.i = 1
WHERE i > 0
AND regexp_count (partner_ids,',') = 0
ORDER BY id)
GROUP BY 1;

Let's start with some of the more obvious things and see if we can start to glean other information.
Next GROUP BY 1 on your outer query needs to be GROUP BY partner_id.
Next you don't need an order by in your INNER query and the database engine will probably do a better job optimizing performance without it so remove ORDER BY id.
If you want your final results to be ordered then add an ORDER BY partner_id or similar clause after your group by of your OUTER query.
It looks like there are also problems with how you are splitting a partnerid from partnerids but I am not positive about that because I need to understand your view and the data it provides to know how that affects your record count for partnerid.
Next your LEFT JOIN statement on the util.seq_0_to_500 I am pretty sure you can drop off the s.i = 1 as the first condition will satisfy that as well because 2 is greater than 1. However your left join really acts more like an inner join because you then exclude any non matches from positron_articles that don't have a s.i > 0.
Oddly then your entire join and inner query gets kind of discarded because you only want articles that have no commas in their partnerids: regexp_count (partner_ids,',') = 0
I would suggest posting the code for your util.seq_0_to_500 and if you have a partner table let use know about that as well because you can probably get your answer a lot easier with that additional table depending on how regexp_count works. I suspect regex_count(partnerids,partnerid) exampleregex_count('12345,678',1234) will return greater than 0 at which point you have no choice but to split the delimited strings into another table before counting or building a new matching function.
If regex_count only matches exact between commas and you have a partner table your query could be as easy as this:
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
positron_articles a
LEFT JOIN PARTNERTABLE p
ON regexp_count(a.partnerids,p.partnerid) > 0
GROUP BY
p.partner_id
I will actually correct myself as I just thought of a way to join a partner table without regexp_count. So if you have a partner table this might work for you. If not you will need to split strings. It basically tests to see if the partnerid is the entire partnerids, at the beginning, in the middle, or at the end of partnerids. If one of those is met then the records is returned.
SELECT
p.partner_id
,COUNT(a.id) AS ArticlesAppearedIn
FROM
PARTNERTABLE p
INNER JOIN positron_articles a
ON
(
CASE
WHEN a.partnerids = CAST(p.partnerid AS VARCHAR(100)) THEN 1
WHEN a.partnerids LIKE p.partnerid + ',%' THEN 1
WHEN a.partnerids LIKE '%,' + p.partnerid + ',%' THEN 1
WHEN a.partnerids LIKE '%,' + p.partnerid THEN 1
ELSE 0
END
) = 1
GROUP BY
p.partner_id

Related

Should I use an SQL full outer join for this?

Consider the following tables:
Table A:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
Table B:
DOC_NUM
DOC_TYPE
RELATED_DOC_NUM
NEXT_STATUS
...
The DOC_TYPE and NEXT_STATUS columns have different meanings between the two tables, although a NEXT_STATUS = 999 means "closed" in both. Also, under certain conditions, there will be a record in each table, with a reference to a corresponding entry in the other table (i.e. the RELATED_DOC_NUM columns).
I am trying to create a query that will get data from both tables that meet the following conditions:
A.RELATED_DOC_NUM = B.DOC_NUM
A.DOC_TYPE = "ST"
B.DOC_TYPE = "OT"
A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999
A.DOC_TYPE = "ST" represents a transfer order to transfer inventory from one plant to another. B.DOC_TYPE = "OT" represents a corresponding receipt of the transferred inventory at the receiving plant.
We want to get records from either table where there is an ST/OT pair where either or both entries are not closed (i.e. NEXT_STATUS < 999).
I am assuming that I need to use a FULL OUTER join to accomplish this. If this is the wrong assumption, please let me know what I should be doing instead.
UPDATE (11/30/2021):
I believe that #Caius Jard is correct in that this does not need to be an outer join. There should always be an ST/OT pair.
With that I have written my query as follows:
SELECT <columns>
FROM A LEFT JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
Does this make sense?
UPDATE 2 (11/30/2021):
The reality is that these are DB2 database tables being used by the JD Edwards ERP application. The only way I know of to see the table definitions is by using the web site http://www.jdetables.com/, entering the table ID and hitting return to run the search. It comes back with a ton of information about the table and its columns.
Table A is really F4211 and table B is really F4311.
Right now, I've simplified the query to keep it simple and keep variables to a minimum. This is what I have currently:
SELECT CAST(F4211.SDDOCO AS VARCHAR(8)) AS SO_NUM,
F4211.SDRORN AS RELATED_PO,
F4211.SDDCTO AS SO_DOC_TYPE,
F4211.SDNXTR AS SO_NEXT_STATUS,
CAST(F4311.PDDOCO AS VARCHAR(8)) AS PO_NUM,
F4311.PDRORN AS RELATED_SO,
F4311.PDDCTO AS PO_DOC_TYPE,
F4311.PDNXTR AS PO_NEXT_STATUS
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = CAST(F4311.PDDOCO AS VARCHAR(8))
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
The other part of the story is that I'm using a reporting package that allows you to define "virtual" views of the data. Virtual views allow the report developer to specify the SQL to use. This is the application where I am using the SQL. When I set up the SQL, there is a validation step that must be performed. It will return a limited set of results if the SQL is validated.
When I enter the query above and validate it, it says that there are no results, which makes no sense. I'm guessing the data casting is causing the issue, but not sure.
UPDATE 3 (11/30/2021):
One more twist to the story. The related doc number is not only defined as a string value, but it contains leading zeros. This is true in both tables. The main doc number (in both tables) is defined as a numeric value and therefore has no leading zeros. I have no idea why those who developed JDE would have done this, but that is what is there.
So, there are matching records between the two tables that meet the criteria, but I think I'm getting no results because when I convert the numeric to a string, it does not match, because one value is, say "12345", while the other is "00012345".
Can I pad the numeric -> string value with zeros before doing the equals check?
UPDATE 4 (12/2/2021):
Was able to finally get the query to work by converting the numeric doc num to a left zero padded string.
SELECT <columns>
FROM PROD2DTA.F4211 AS F4211
INNER JOIN PROD2DTA.F4311 AS F4311
ON F4211.SDRORN = RIGHT(CONCAT('00000000', CAST(F4311.PDDOCO AS VARCHAR(8))), 8)
WHERE F4211.SDDCTO IN ( 'ST' )
AND F4311.PDDCTO IN ( 'OT' )
AND ( F4211.SDNXTR < 999
OR F4311.PDNXTR < 999 )
You should write your query as follows:
SELECT <columns>
FROM A INNER JOIN B
ON
A.RELATED_DOC_NUM = B.DOC_NUM
WHERE
A.DOC_TYPE IN ('ST') AND
B.DOC_TYPE IN ('OT') AND
(A.NEXT_STATUS < 999 OR B.NEXT_STATUS < 999)
LEFT join is a type of OUTER join; LEFT JOIN is typically a contraction of LEFT OUTER JOIN). OUTER means "one side might have nulls in every column because there was no match". Most critically, the code as posted in the question (with a LEFT JOIN, but then has WHERE some_column_from_the_right_table = some_value) runs as an INNER join, because any NULLs inserted by the LEFT OUTER process, are then quashed by the WHERE clause
See Update 4 for details of how I resolved the "data conversion or mapping" error.

SQL Filtering duplicate rows due to bad ETL

The database is Postgres but any SQL logic should help.
I am retrieving the set of sales quotations that contain a given product within the bill of materials. I'm doing that in two steps: step 1, retrieve all DISTINCT quote numbers which contain a given product (by product number).
The second step, retrieve the full quote, with all products listed for each unique quote number.
So far, so good. Now the tough bit. Some rows are duplicates, some are not. Those that are duplicates (quote number & quote version & line number) might or might not have maintenance on them. I want to pick the row that has maintenance greater than 0. The duplicate rows I want to exclude are those that have a 0 maintenance. The problem is that some rows, which have no duplicates, have 0 maintenance, so I can't just filter on maintenance.
To make this exciting, the database holds quotes over 20+ years. And the data scientists guys have just admitted that maybe the ETL process has some bugs...
--- step 0
--- cleanup the workspace
SET CLIENT_ENCODING TO 'UTF8';
DROP TABLE IF EXISTS product_quotes;
--- step 1
--- get list of Product Quotes
CREATE TEMPORARY TABLE product_quotes AS (
SELECT DISTINCT master_quote_number
FROM w_quote_line_d
WHERE item_number IN ( << model numbers >> )
);
--- step 2
--- Now join on that list
SELECT
d.quote_line_number,
d.item_number,
d.item_description,
d.item_quantity,
d.unit_of_measure,
f.ref_list_price_amount,
f.quote_amount_entered,
f.negtd_discount,
--- need to calculate discount rate based on list price and negtd discount (%)
CASE
WHEN ref_list_price_amount > 0
THEN 100 - (ref_list_price_amount + negtd_discount) / ref_list_price_amount *100
ELSE 0
END AS discount_percent,
f.warranty_months,
f.master_quote_number,
f.quote_version_number,
f.maintenance_months,
f.territory_wid,
f.district_wid,
f.sales_rep_wid,
f.sales_organization_wid,
f.install_at_customer_wid,
f.ship_to_customer_wid,
f.bill_to_customer_wid,
f.sold_to_customer_wid,
d.net_value,
d.deal_score,
f.transaction_date,
f.reporting_date
FROM w_quote_line_d d
INNER JOIN product_quotes pq ON (pq.master_quote_number = d.master_quote_number)
INNER JOIN w_quote_f f ON
(f.quote_line_number = d.quote_line_number
AND f.master_quote_number = d.master_quote_number
AND f.quote_version_number = d.quote_version_number)
WHERE d.net_value >= 0 AND item_quantity > 0
ORDER BY f.master_quote_number, f.quote_version_number, d.quote_line_number
The logic to filter the duplicate rows is like this:
For each master_quote_number / version_number pair, check to see if there are duplicate line numbers. If so, pick the one with maintenance > 0.
Even in a CASE statement, I'm not sure how to write that.
Thoughts? The database is Postgres but any SQL logic should help.
I think you will want to use Window Functions. They are, in a word, awesome.
Here is a query that would "dedupe" based on your criteria:
select *
from (
select
* -- simplifying here to show the important parts
,row_number() over (
partition by master_quote_number, version_number
order by maintenance desc) as seqnum
from w_quote_line_d d
inner join product_quotes pq
on (pq.master_quote_number = d.master_quote_number)
inner join w_quote_f f
on (f.quote_line_number = d.quote_line_number
and f.master_quote_number = d.master_quote_number
and f.quote_version_number = d.quote_version_number)
) x
where seqnum = 1
The use of row_number() and the chosen partition by and order by criteria guarantee that only ONE row for each combination of quote_number/version_number will get the value of 1, and it will be the one with the highest value in maintenance (if your colleagues are right, there would only be one with a value > 0 anyway).
Can you do something like...
select
*
from
w_quote_line_d d
inner join
(
select
...
,max(maintenance)
from
w_quote_line_d
group by
...
) d1
on
d1.id = d.id
and d1.maintenance = d.maintenance;
Am I understanding your problem correctly?
Edit: Forgot the group by!
I'm not sure, but maybe you could Group By all other columns and use MAX(Maintenance) to get only the greatest.
What do you think?

Include missing years in Group By query

I am fairly new in Access and SQL programming. I am trying to do the following:
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
and group by year even when there is no amount in some of the years. I would like to have these years listed as well for a report with charts. I'm not certain if this is possible, but every bit of help is appreciated.
My code so far is as follows:
SELECT
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
FROM
Base_CustomerT
INNER JOIN (
SO_SalesOrderPaymentHistoryLineT
INNER JOIN SO_SalesOrderT
ON SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId
) ON Base_CustomerT.CustomerId = SO_SalesOrderT.CustomerId
GROUP BY
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
SO_SalesOrderPaymentHistoryLineT.PaymentType,
Base_CustomerT.IsActive
HAVING
(((SO_SalesOrderPaymentHistoryLineT.PaymentType)=1)
AND ((Base_CustomerT.IsActive)=Yes))
ORDER BY
Base_CustomerT.SalesRep,
Base_CustomerT.Customer;
You need another table with all years listed -- you can create this on the fly or have one in the db... join from that. So if you had a table called alltheyears with a column called y that just listed the years then you could use code like this:
WITH minmax as
(
select min(year(SO_SalesOrderPaymentHistoryLineT.DatePaid) as minyear,
max(year(SO_SalesOrderPaymentHistoryLineT.DatePaid) as maxyear)
from SalesOrderPaymentHistoryLineT
), yearsused as
(
select y
from alltheyears, minmax
where alltheyears.y >= minyear and alltheyears.y <= maxyear
)
select *
from yearsused
join ( -- your query above goes here! -- ) T
ON year(T.SO_SalesOrderPaymentHistoryLineT.DatePaid) = yearsused.y
You need a data source that will provide the year numbers. You cannot manufacture them out of thin air. Supposing you had a table Interesting_year with a single column year, populated, say, with every distinct integer between 2000 and 2050, you could do something like this:
SELECT
base.SalesRep,
base.CustomerId,
base.Customer,
base.year,
Sum(NZ(data.Amount)) AS [Sum Of PaymentPerYear]
FROM
(SELECT * FROM Base_CustomerT INNER JOIN Year) AS base
LEFT JOIN
(SELECT * FROM
SO_SalesOrderT
INNER JOIN SO_SalesOrderPaymentHistoryLineT
ON (SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId)
) AS data
ON ((base.CustomerId = data.CustomerId)
AND (base.year = Year(data.DatePaid))),
WHERE
(data.PaymentType = 1)
AND (base.IsActive = Yes)
AND (base.year BETWEEN
(SELECT Min(year(DatePaid) FROM SO_SalesOrderPaymentHistoryLineT)
AND (SELECT Max(year(DatePaid) FROM SO_SalesOrderPaymentHistoryLineT))
GROUP BY
base.SalesRep,
base.CustomerId,
base.Customer,
base.year,
ORDER BY
base.SalesRep,
base.Customer;
Note the following:
The revised query first forms the Cartesian product of BaseCustomerT with Interesting_year in order to have base customer data associated with each year (this is sometimes called a CROSS JOIN, but it's the same thing as an INNER JOIN with no join predicate, which is what Access requires)
In order to have result rows for years with no payments, you must perform an outer join (in this case a LEFT JOIN). Where a (base customer, year) combination has no associated orders, the rest of the columns of the join result will be NULL.
I'm selecting the CustomerId from Base_CustomerT because you would sometimes get a NULL if you selected from SO_SalesOrderT as in the starting query
I'm using the Access Nz() function to convert NULL payment amounts to 0 (from rows corresponding to years with no payments)
I converted your HAVING clause to a WHERE clause. That's semantically equivalent in this particular case, and it will be more efficient because the WHERE filter is applied before groups are formed, and because it allows some columns to be omitted from the GROUP BY clause.
Following Hogan's example, I filter out data for years outside the overall range covered by your data. Alternatively, you could achieve the same effect without that filter condition and its subqueries by ensuring that table Intersting_year contains only the year numbers for which you want results.
Update: modified the query to a different, but logically equivalent "something like this" that I hope Access will like better. Aside from adding a bunch of parentheses, the main difference is making both the left and the right operand of the LEFT JOIN into a subquery. That's consistent with the consensus recommendation for resolving Access "ambiguous outer join" errors.
Thank you John for your help. I found a solution which works for me. It looks quiet different but I learned a lot out of it. If you are interested here is how it looks now.
SELECT DISTINCTROW
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
FROM
Base_Customer_RevenueYearQ
LEFT JOIN CustomerPaymentPerYearQ
ON (Base_Customer_RevenueYearQ.RevenueYear = CustomerPaymentPerYearQ.[RevenueYear])
AND (Base_Customer_RevenueYearQ.CustomerId = CustomerPaymentPerYearQ.CustomerId)
GROUP BY
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
;

SQL SUM function doubling the amount it should using multiple tables

My query below is doubling the amount on the last record it returns. I have 3 tables - activities, bookings and tempbookings. The query needs to list the activities and attached information and pull the total number (using the SUM) of places booked (as BookingTotal) from the booking table by each activity and then it needs to calculate the same for tempbookings (as tempPlacesReserved) providing the reservedate field inside that table is in the future.
However the first issue is that if there are no records for an activity in the tempbookings table it does not return any records for that activity at all, to get around this i created dummy records in the past so that it still returns the record, but if I can make it so I don't have to do this I would prefer it!
The main issue I have is that on the final record of the returned results it doubles the booking total and the places reserved which of course makes the whole query useless.
I know that I am doing something wrong I just haven't been able to sort it, I have searched similar issues online but am unable to apply them to my situation correctly.
Any help would be appreciated.
P.S. I'm aware that normally you wouldn't need to fully label all the paths to the databases, tables and fields as I have but for the program I am planning to use it in I have to do it this way.
Code:
SELECT [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice],
SUM([LeisureActivities].[dbo].[bookings].[bookingPlaces]) AS 'bookingTotal',
SUM (CASE WHEN[LeisureActivities].[dbo].[tempbookings].[tempReserveDate] > GetDate() THEN [LeisureActivities].[dbo].[tempbookings].[tempPlaces] ELSE 0 end) AS 'tempPlacesReserved'
FROM [LeisureActivities].[dbo].[activities],
[LeisureActivities].[dbo].[bookings],
[LeisureActivities].[dbo].[tempbookings]
WHERE ([LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[bookings].[activityID]
AND [LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[tempbookings].[tempActivityID])
AND [LeisureActivities].[dbo].[activities].[activityDate] > GetDate ()
GROUP BY [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice];
Your current query is using an INNER JOIN between each of the tables so if the tempBookings table has no records, you will not return anything.
I would advise that you start to use JOIN syntax. You might also need to use subqueries to get the totals.
SELECT a.[activityID],
a.[activityName],
a.[activityDate],
a.[activityPlaces],
a.[activityPrice],
coalesce(b.bookingTotal, 0) bookingTotal,
coalesce(t.tempPlacesReserved, 0) tempPlacesReserved
FROM [LeisureActivities].[dbo].[activities] a
LEFT JOIN
(
select activityID,
SUM([bookingPlaces]) AS bookingTotal
from [LeisureActivities].[dbo].[bookings]
group by activityID
) b
ON a.[activityID]=b.[activityID]
LEFT JOIN
(
select tempActivityID,
SUM(CASE WHEN [tempReserveDate] > GetDate() THEN [tempPlaces] ELSE 0 end) AS tempPlacesReserved
from [LeisureActivities].[dbo].[tempbookings]
group by tempActivityID
) t
ON a.[activityID]=t.[tempActivityID]
WHERE a.[activityDate] > GetDate();
Note: I am using aliases because it is easier to read
Use new SQL-92 Join syntax, and make join to tempBookings an outer join. Also clean up your sql with table aliases. Makes it easier to read. As to why last row has doubled values, I don't know, but on off chance that it is caused by extra dummy records you entered. get rid of them. That problem is fixed by using outer join to tempBookings. The other possibility is that the join conditions you had to the tempBookings table(t.tempActivityID = a.activityID) is insufficient to guarantee that it will match to only one record in activities table... If, for example, it matches to two records in activities, then the rows from Tempbookings would be repeated twice in the output, (causing the sum to be doubled)
SELECT a.activityID, a.activityName, a.activityDate,
a.activityPlaces, a.activityPrice,
SUM(b.bookingPlaces) bookingTotal,
SUM (CASE WHEN t.tempReserveDate > GetDate()
THEN t.tempPlaces ELSE 0 end) tempPlacesReserved
FROM LeisureActivities.dbo.activities a
Join LeisureActivities.dbo.bookings b
On b.activityID = a.activityID
Left Join LeisureActivities.dbo.tempbookings t
On t.tempActivityID = a.activityID
WHERE a.activityDate > GetDate ()
GROUP BY a.activityID, a.activityName,
a.activityDate, a.activityPlaces,
a.activityPrice;

outer query to list only if its rowcount equates to inner subquery

Need help on a query using sql server 2005
I am having two tables
code
chargecode
chargeid
orgid
entry
chargeid
itemNo
rate
I need to list all the chargeids in entry table if it contains multiple entries having different chargeids
which got listed in code table having the same charge code.
data :
code
100,1,100
100,2,100
100,3,100
101,11,100
101,12,100
entry
1,x1,1
1,x2,2
2,x3,2
11,x4,1
11,x5,1
using the above data , it query should list chargeids 1 and 2 and not 11.
I got the way to know how many rows in entry satisfies the criteria, but m failing to get the chargeids
select count (distinct chargeId)
from entry where chargeid in (select chargeid from code where chargecode = (SELECT A.chargecode
from code as A join code as B
ON A.chargecode = B.chargeCode and A.chargetype = B.chargetype and A.orgId = B.orgId AND A.CHARGEID = b.CHARGEid
group by A.chargecode,A.orgid
having count(A.chargecode) > 1)
)
First off: I apologise for my completely inaccurate original answer.
The solution to your problem is a self-join. Self-joins are used when you want to select more than one row from the same table. In our case we want to select two charge IDs that have the same charge code:
SELECT DISTINCT c1.chargeid, c2.chargeid FROM code c1
JOIN code c2 ON c1.chargeid != c2.chargeid AND c1.chargecode = c2.chargecode
JOIN entry e1 ON e1.chargeid = c1.chargeid
JOIN entry e2 ON e2.chargeid = c2.chargeid
WHERE c1.chargeid < c2.chargeid
Explanation of this:
First we pick any two charge IDs from 'code'. The DISTINCT avoids duplicates. We make sure they're two different IDs and that they map to the same chargecode.
Then we join on 'entry' (twice) to make sure they both appear in the entry table.
This approach gives (for your example) the pairs (1,2) and (2,1). So we also insist on an ordering; this cuts to result set down to just (1,2), as you described.