LDAP Query for all uniquemembers of a group - ldap

I can accomplish this with memberOf, but due to API constraints with Okta, that goes over our rate limit and is too slow for production use. I am trying to do this with uniqueMember instead, since it supports indexing, but I am having a real hard time getting the query right.
Here is what I have; I believe the issue is how I am thinking about uniqueMember. As far as I can tell it is just an attribute on the group entries, but I don't know how to query a list of groups to get their uniqueMember values.
(&(objectClass=inetOrgPerson)(|(uniqueMember=CN=groupnamehere,ou=groups,dc=myorg,dc=okta,dc=com)))
The memberOf query that works:
(&(objectClass=inetOrgPerson)(|(memberOf=CN=group1,ou=groups,dc=myorg,dc=okta,dc=com)))
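Since uniqueMember lives on the group entry rather than on each person entry, one way around the filter above is to search the group itself and read its uniqueMember values. A minimal sketch of that idea, using placeholder DNs from the question (the helper name and base DN default are illustrative, not an Okta-specific API):

```python
# Sketch: instead of filtering person entries by uniqueMember, read the
# group entry itself and take its uniqueMember attribute values.
# The group/base DNs below are placeholders from the question.

def group_member_search(group_cn, base="ou=groups,dc=myorg,dc=okta,dc=com"):
    """Return (search_base, search_filter, attributes) for a single LDAP
    search that fetches every uniqueMember DN of one group."""
    search_base = f"cn={group_cn},{base}"
    # Base-scoped search on the group entry; uniqueMember is the
    # multi-valued attribute holding one DN per member.
    search_filter = "(objectClass=groupOfUniqueNames)"
    attributes = ["uniqueMember"]
    return search_base, search_filter, attributes

base, flt, attrs = group_member_search("group1")
print(base)  # cn=group1,ou=groups,dc=myorg,dc=okta,dc=com
print(flt)   # (objectClass=groupOfUniqueNames)
```

Each returned uniqueMember value is itself a user DN, so resolving member details still means follow-up lookups, but the group read itself is a single indexed search.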

Related

How to restrict wild card search on Oracle Unified Directory server for read only user

We have Oracle Unified Directory (OUD) with around 10 million users, and a read-only account that is used for search operations on the base DN. But sometimes this account is used to perform wildcard searches like (uid=*) on the base DN, which causes timeouts on OUD and also hurts performance. Is there a way to restrict this wildcard search on the base DN?
You can set resource limits on the user that is performing the wildcard search.
This will of course cause all the queries they run to be limited to the amount of entries set and have unknown consequences in whatever app is using the returned data. It’s probably preferable to educate them on how to perform paged searches so you can still define limits and they can still consume the data.
Create an ldif file and run ldapmodify with limits suited for your environment.
dn: uid=baduser,ou=people,dc=example,dc=com
changetype: modify
add: ds-rlim-lookthrough-limit
ds-rlim-lookthrough-limit: 10000
-
add: ds-rlim-size-limit
ds-rlim-size-limit: 5000
-
add: ds-rlim-time-limit
ds-rlim-time-limit: 300
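The paged-search pattern mentioned above can be sketched client-side. The fetch_page callable below is a stand-in for an LDAP simple-paged-results request (RFC 2696); here it just slices an in-memory list so the control flow is runnable:

```python
# Sketch of the recommended client pattern: consume a large result set
# in fixed-size pages instead of issuing one unbounded search.

def paged_search(fetch_page, page_size=500):
    """Yield entries page by page until the server returns a short page."""
    offset = 0  # a real LDAP client would carry an opaque paging cookie
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            break
        offset += page_size

# Simulated directory of 1234 entries, paged 500 at a time.
entries = [f"uid=user{i}" for i in range(1234)]
fake_fetch = lambda offset, size: entries[offset:offset + size]
results = list(paged_search(fake_fetch, page_size=500))
print(len(results))  # 1234
```

With limits like the LDIF above in place, each page stays under the size limit while the client still sees the full result set.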

Trouble Looking For Events WITHIN a Session In BigQuery or WITHIN Multiple Sessions

I wanted to get a second pair of eyes and some help confirming the best way to look within a session at the hit level in BigQuery. I have read the BigQuery developer documentation thoroughly, which provides insight on working within a session. My challenge is this: let us assume I write the high-level query to count the number of sessions and group them by device.deviceCategory, as below:
SELECT device.deviceCategory,
COUNT(DISTINCT CONCAT (fullVisitorId, STRING (visitId)), 10000000) AS SESSIONS
FROM (TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2015-01-01'), TIMESTAMP('2015-06-30')))
GROUP EACH BY device.deviceCategory
ORDER BY sessions DESC
I then run a follow-up query like the following to find the number of distinct users (client IDs):
SELECT device.deviceCategory,
COUNT(DISTINCT fullVisitorID) AS USERS
FROM (TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2015-01-01'), TIMESTAMP('2015-06-30')))
GROUP EACH BY device.deviceCategory
ORDER BY users DESC
(Note that I broke those up because of the sheer size of the data I am working with which produces runs greater than 5TB in some cases).
My challenge is the following: I feel like I have the wrong approach and have not had success with the WITHIN function. For every user ID (or fullVisitorId), I want to look within all their various sessions to find out how many of their sessions were desktop and how many were mobile. Basically, these are the cross-device users, and I want to collect a table of these users. I started here:
SELECT COUNT(DISTINCT CONCAT (fullVisitorId, STRING (visitId)), 10000000) AS SESSIONS
FROM (TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2015-01-01'), TIMESTAMP('2015-06-30')))
WHERE device.deviceCategory = 'desktop' AND device.deviceCategory = 'mobile'
This is not correct, though. Moreover, any version of a WITHIN query I write gives me nonsense results, or results that are 0. Does anyone have any strategies or tips to recommend a way forward? What is the best way to use the WITHIN function to look for sessions that may have multiple events happening within the session, with my goal being to collect the user IDs that meet certain requirements within a session or across various sessions? Two days ago I did this in a very manual way, working through the steps by hand and saving intermediate data frames to generate counts. That said, I wanted to see if there was any guidance on doing this quickly with a single query.
I'm not sure if this question is still open on your end, but I believe I see your problem, and it is not misuse of the WITHIN function. It is a data-understanding problem.
When dealing with GA and cross-device identification, you cannot reliably use any combination of fullVisitorId and visitId to identify users, as these are derived from the cookie that GA places on the user's browser. The fullVisitorId therefore identifies a specific browser on a specific device, not a specific user.
In order to truly track users across devices, you must be able to leverage the userId functionality. This requires you to have the user sign in in some way, thus giving them an identifier that you can use across all of their devices to tie their behavior together.
After you implement some type of user identification that you can control, rather than GA's cookie assignment, you can use that to look for details across sessions and within those individual sessions.
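Once such an identifier exists, the aggregation the question is after reduces to grouping sessions by user and keeping users whose sessions span more than one device category. A runnable sketch over made-up in-memory rows (with GA's cookie-based fullVisitorId this would identify cross-device browsers; with a real signed-in userId it identifies cross-device users):

```python
# Group sessions by user id and flag users seen on >1 device category.
from collections import defaultdict

# Sample (user_id, device_category) session rows; values are illustrative.
sessions = [
    ("user_a", "desktop"), ("user_a", "mobile"), ("user_a", "desktop"),
    ("user_b", "desktop"), ("user_b", "desktop"),
    ("user_c", "mobile"), ("user_c", "tablet"),
]

devices_by_user = defaultdict(set)
for user_id, device_category in sessions:
    devices_by_user[user_id].add(device_category)

# Cross-device users: more than one distinct device category.
cross_device = sorted(u for u, d in devices_by_user.items() if len(d) > 1)
print(cross_device)  # ['user_a', 'user_c']
```

The same shape translates to SQL as a GROUP BY on the user id with a HAVING clause on COUNT(DISTINCT device.deviceCategory).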
Hope that helps!

how to list job ids from all users?

I'm using the Java API to query for all job ids using the code below
Bigquery.Jobs.List list = bigquery.jobs().list(projectId);
list.setAllUsers(true);
JobList response = list.execute();
but it doesn't list the job ids that were run by a Client ID for web applications (i.e. Metric Insights). I'm using private key authentication.
Using the command line tool 'bq ls -j' in turn gives me only the Metric Insights job ids, but not the ones run with private key auth. Is there a get-all method?
The reason I'm doing this is trying to get better visibility into what queries are eating up our data usage. We have multiple sources of queries: metric insights, in house automation, some done manually, etc.
As of version 2.0.10, the bq client has support for API authorization using service account credentials. You can specify using a specific service account with the following flags:
bq --service_account your_service_account_here@developer.gserviceaccount.com \
   --service_account_credential_store my_credential_file \
   --service_account_private_key_file mykey.p12 <your_commands, etc>
Type bq --help for more information.
My hunch is that listing jobs for all users is broken, and nobody has mentioned it since there is usually a workaround. I'm currently investigating.
Jordan -- It sounds like you're homing in on what we want to do. For all access that we've allowed into our project/dataset, we want to produce an aggregate report of the "totalBytesProcessed" for all queries executed.
The problem we're struggling with is that we have a handful of distinct java programs accessing our data, a 3rd party service (metric insights) and 7-8 individual users who have query access via the web interface. Fortunately the incoming data only has one source so explaining the cost for that is simple. For queries though I am kinda blind at the moment (and it appears queries will be the bulk of the monthly bill).
It would be ideal if I could get the underlying data for this report with just one listing made with a single top-level auth. With that, I think the timestamps and the actual SQL text would let me attribute each query to a source.
One thing that might make this problem far easier is if there were more information in the job record (or some text adornment in the job_id for queries). I don't see that I can assign my own jobIDs on queries (perhaps I missed it?) and perhaps recording some source information in the job record would be possible? Just thinking out loud now...
There are three tables you can query for this.
region-**.INFORMATION_SCHEMA.JOBS_BY_{USER, PROJECT, ORGANIZATION}
Where ** should be replaced by your region.
Example query for JOBS_BY_USER in the eu region:
select
  count(*) as num_queries,
  date(creation_time) as date,
  sum(total_bytes_processed) as total_bytes_processed,
  sum(total_slot_ms) as total_slot_ms
from
  `region-eu.INFORMATION_SCHEMA.JOBS_BY_USER`
group by
  2
order by
  2 desc, total_bytes_processed desc;
Documentation is available at:
https://cloud.google.com/bigquery/docs/information-schema-jobs
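The per-source cost report the earlier comments ask for is the same aggregation in miniature: group job records by who ran them and sum bytes processed. A sketch over sample rows shaped like INFORMATION_SCHEMA jobs records (field names follow the documented schema; the emails and byte counts are made up):

```python
# Aggregate total_bytes_processed per source (user/service account).
from collections import defaultdict

# Sample job rows; in practice these come from INFORMATION_SCHEMA.JOBS_BY_*.
jobs = [
    {"user_email": "svc-automation@example.iam.gserviceaccount.com",
     "total_bytes_processed": 5_000_000_000},
    {"user_email": "analyst@example.com",
     "total_bytes_processed": 750_000_000},
    {"user_email": "svc-automation@example.iam.gserviceaccount.com",
     "total_bytes_processed": 2_500_000_000},
]

bytes_by_source = defaultdict(int)
for job in jobs:
    bytes_by_source[job["user_email"]] += job["total_bytes_processed"]

# Biggest consumers first.
for source, total in sorted(bytes_by_source.items(), key=lambda kv: -kv[1]):
    print(source, total)
```

Since user_email distinguishes service accounts from interactive users, this splits out the in-house automation, the third-party service, and the web-UI users without needing custom job ids.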

LDAP filter boolean expression maximum number of arguments

I was writing a small test case to see what's more efficient, multiple small queries or a single big query, when I encountered this limitation.
The query looks like this:
(| (clientid=1) (clientid=2) (clientid=3) ...)
When the number of clients goes beyond 2103 (?!), the LDAP server throws an error:
error code 1 - Operations Error
As far as I can tell, the actual filter string length does not matter; mine was ~69 KB (at least for Microsoft AD the length limit is 10 MB). I tried with longer attribute names and got the same strange limit: 2103 operands.
Does anyone have more information about this limitation?
Is this something specified in the LDAP protocol specification or is it implementation specific?
Is it configurable?
I tested this against IBM Tivoli Directory Server V6.2 using both the UnboundID and JNDI Java libraries.
It cannot be more than 8099 characters. See http://www-01.ibm.com/support/docview.wss?uid=swg21295980
Also, what you are doing is not good practice. If there are common attributes these entries share (e.g., country code, department number, location), try to retrieve the results using common criteria on those attributes. If not, divide your search filter into smaller ones, each with a few predicates, and execute multiple searches. It depends on the programming language you're using, but try to execute each search in a separate thread to speed up your data retrieval.
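The "divide your search filter into smaller ones" advice can be sketched as a helper that splits a long value list into OR filters of bounded size (the attribute name and chunk size are illustrative; tune the chunk size to stay under your server's limit):

```python
# Split a long list of values into OR filters of at most chunk_size
# predicates each, to be executed as multiple searches and merged.

def chunked_or_filters(attr, values, chunk_size=500):
    """Build (|(attr=v1)(attr=v2)...) filter strings with at most
    chunk_size predicates per filter."""
    filters = []
    for i in range(0, len(values), chunk_size):
        chunk = values[i:i + chunk_size]
        filters.append("(|" + "".join(f"({attr}={v})" for v in chunk) + ")")
    return filters

filters = chunked_or_filters("clientid", [str(i) for i in range(1200)], 500)
print(len(filters))  # 3
```

Each resulting filter stays far below both the ~2103-operand behavior observed above and any server-side length limit, and the searches can be run concurrently as the answer suggests.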

What does the ISPALUSER function call in the Msmerge_*_VIEW views do?

I'm trying to understand how SQL Server 2005 replication works, and I'm looking at the views titled Msmerge_[Publication]_[Table]_VIEW. These views seems to define the merge filters, and are pretty straight forward, except for one line of sql in the WHERE clause:
AND ({fn ISPALUSER('1A381615-B57D-4915-BA4B-E16BF7A9AC58')} = 1)
What does the ISPALUSER function do? I cannot seem to find it anywhere under functions in management studio, nor really any mention of it on the web.
(The reason I'm looking at these views is that we have a performance issue when a client replicates up new records. SQL like if not exists (select 1 from [MSmerge_[Publication]_[Table]_VIEW] where [rowguid] = #rowguid) is running and taking 10+ seconds per row, which obviously kills performance when you have more than a couple of rows going up.)
It seems it checks whether the user is in the special security role MSmerge_PAL_role, which appears to govern who has access to the replication functionality.
Therefore, ISPALUSER checks whether the user is in that specific role.
Still not sure what the PAL stands for (most likely "publication access list").