NuPIC OPF Runtime error getOutputData unknown output categoriesOut - nupic

I'm trying to run a TemporalClassification model using OPF to recognize patterns from a stream. I've adjusted the model params so that it has two Sensor inputs: a ScalarEncoder and an SDRCategoryEncoder. The latter is marked as classifierOnly and is also set as predictedField in inferences.
When I try to feed the model with input data, I get:
RuntimeError: getOutputData unknown output 'categoriesOut' on region Classifier.
A NontemporalClassification model (only inferenceType changed) runs without this error.
I've found 6 occurrences of categoriesOut in the nupic code: https://github.com/numenta/nupic/search?utf8=%E2%9C%93&q=categoriesOut
The error arises in nupic/frameworks/opf/clamodel.py at line 558:
classificationDist = classifier.getOutputData('categoriesOut')
It seems that the ClassifierRegion in the network is not set up to output this data.
Can anyone explain why the classifier region doesn't have 'categoriesOut'? I guess there's a misconfiguration in my model params, but there were no errors or warnings during model initialization. Are there any mandatory parameters or assignments (beyond those noted in the NuPIC documentation) necessary for a TemporalClassification model to run?

There are several types of classifier regions in NuPIC; you can find them in the nupic/regions folder. I checked the sources and found that 'categoriesOut' is in the outputs dict of KNNClassifierRegion:
https://github.com/numenta/nupic/blob/469f6372082e95dd5d2a96181b745ba36d2e7a8a/nupic/regions/KNNClassifierRegion.py
outputs=dict(
    categoriesOut=dict(
        description='A vector representing, for each category '
                    'index, the likelihood that the input to the node belongs '
                    'to that category based on the number of neighbors of '
                    'that category that are among the nearest K.',
        dataType='Real32',
        count=0,
        regionLevel=True,
        isDefaultOutput=True),
Make sure you use KNNClassifierRegion when configuring your TemporalClassification model. The samples for NontemporalClassification use CLAClassifier, but CLAClassifierRegion has no categoriesOut in its outputs, so the error described in your question will arise if you keep
'regionName' : 'CLAClassifierRegion'
for a TemporalClassification model.
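A minimal sketch of that change, assuming your model params follow the usual OPF layout in which the classifier region is named under modelParams -> clParams -> regionName. The helper use_knn_classifier and the trimmed example dict are hypothetical and only illustrate the swap. Other clParams entries are classifier-specific and will probably need adjusting for the KNN region as well, which is not shown here.

```python
import copy


def use_knn_classifier(model_params):
    """Return a copy of the params with the classifier region set to KNNClassifierRegion.

    Hypothetical helper: it assumes the 'modelParams' -> 'clParams' -> 'regionName'
    layout and does not touch any other classifier parameter.
    """
    params = copy.deepcopy(model_params)
    params["modelParams"]["clParams"]["regionName"] = "KNNClassifierRegion"
    return params


# Heavily trimmed, illustrative params dict (not a complete OPF configuration).
example_params = {
    "model": "CLA",
    "modelParams": {
        "inferenceType": "TemporalClassification",
        "clParams": {"regionName": "CLAClassifierRegion"},
    },
}

fixed_params = use_knn_classifier(example_params)
print(fixed_params["modelParams"]["clParams"]["regionName"])
```

The original dict is left untouched, so you can compare both configurations side by side.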

GNURadio Companion and OFDM TX and RX in single Graph

I am following this GitHub example to understand OFDM in gnuradio-companion. I am able to execute ofdm_tx individually (64- and 512-point FFT) without any issues, but when I connect the two in a single graph, I only get a spectrum from ofdm_tx (ofdm_rx gives no output, or just a straight line).
My question: each time I close the output spectrum window, the tool hangs, and in the background (inside gnuradio-companion) I observe the following train of messages (screenshot attached). A similar thing happens when I run ofdm_rx individually.
Error message in console:
packet_headerparser_b :info: Detected an invalid packet at item 1448.
header_payload_demux :info :parser returned #f
Please guide me in this regard.
Selecting "No" for the vector source "Repeat" variable sorted out the issue (no hang), but I am no longer able to see the spectrum.

Why is VK_SAMPLE_COUNT_1_BIT an invalid choice for multisampling in Vulkan?

Hello people of StackOverflow,
I am currently working on a game engine using the Vulkan graphics API. In the past I was just setting anti-aliasing to the max it could be. Today, however, I was trying to turn it off (to improve performance on weaker systems). To do this I tried to set the MSAA samples in my engine to VK_SAMPLE_COUNT_1_BIT, but this produced the validation error:
Validation Error: [ VUID-VkSubpassDescription-pResolveAttachments-00848 ] Object 0: handle = 0x55aaa6e32828, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0xfad6c3cb | ValidateCreateRenderPass(): Subpass 0 requests multisample resolve from attachment 0 which has VK_SAMPLE_COUNT_1_BIT. The Vulkan spec states: If pResolveAttachments is not NULL, for each resolve attachment that is not VK_ATTACHMENT_UNUSED, the corresponding color attachment must not have a sample count of VK_SAMPLE_COUNT_1_BIT (https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VUID-VkSubpassDescription-pResolveAttachments-00848)
I can work around this problem relatively easily, so it isn't really an issue for me. However, I was wondering why exactly this limit is in place. If I want to set the MSAA samples to 1, why can't I?
Thanks,
sckzor
A sample count of 1 means "not a multisampled image". And if you're doing multisample resolve, resolving from a non-multisampled image doesn't make sense. Which is also why you can't use such images for any other things that expect a multisampled image (you can't use an MS-style sampler or texture function on them).

Log messages classification/grouping and finding human readable pattern for each group

As someone new to data science and machine learning, I would like to ask the following questions about the problem explained below:
Is machine learning a good fit for such a problem, or is it overkill?
Could this problem be related to another classical problem that already has published papers, so I can choose the right solution?
The problem:
I've been doing research on a pretty interesting problem that I believe many analytics systems have solved with an automated process.
We are collecting many JavaScript error messages that occur in all kinds of browsers and custom-built web applications. Our goal is to group similar messages and label each group by the common pattern the grouped messages share.
Example:
+---------------------------------------------------------------+
|Label: "Forbidden: User session {{placeholder1}} has expired." |
+---------------------------------------------------------------+
|Message: "Forbidden: User session aad3-1v299-4400 has expired."|
|Message: "Forbidden: User session jj41-1d333-bbaa has expired."|
|Message: "Forbidden: User session aab3-bn12n-1111 has expired."|
+---------------------------------------------------------------+
So far we have a semi-automated process that solves the problem, but from time to time we get new user-generated JavaScript error messages that slip through our filters.
I've been thinking about a naive two-step approach that uses existing libraries/tools/algorithms:
1. For a batch of error lines, run an algorithm (e.g. Levenshtein) that finds similar strings. Group the similar errors.
2. Within a group of similar strings, run a diff and find the dynamic parts. Check the diff:
For reference, here are the error messages that were collected over a period of one minute:
Message: 3312445,Error: Unknown page "retina_list"
Message: 9931234,Error: Unknown page "widget_summary"
Message: ReferenceError: 'alg,TypeError: g' is undefined
Message: 522574,Error: Unknown page "page_options"
Message: ReferenceError: '297756| Zly / Error in handler for event:,[object Object],ApiListenerError: TypeError: a' is undefined
Message: [Euv warn]: style="width: {{item.evaluation}}em": interpolation in 'style' attribute will cause the attribute to be discarded in Internet Explorer. Use krt-bind:style instead. (found in component: <default-componentfalse2320383>)
Message: [Euv warn]: src="//www.example.com/image/{{item._id}}-1.jpg?w=220&h=165&mode=crop": interpolation in 'src' attribute will cause a 404 request. Use krt-bind:src instead. (found in component: <default-componentfalse8372912>)
Message: [Euv warn]: src="//www.example.com/image/{{item._id}}?car=recommend_sp312": interpolation in 'src' attribute will cause a 404 request. Use krt-bind:src instead. (found in component: <default-componentfalse3330736>)
Message: [Euv warn]: src="//www.example.com/image/{{item._id}}-1.jpg?w=220&h=165&mode=crop": interpolation in 'src' attribute will cause a 404 request. Use krt-bind:src instead. (found in component: <default-componentfalse4893336>)
Message: ReferenceError: 'alg,TypeError: g' is undefined
Message: 73276| Zly / Error in handler for event:,[object Object],ApiListenerError: TypeError: Cannot read property 'campaignName' of undefined
Message: ReferenceError: 'alg,TypeError: g' is undefined
Message: ReferenceError: 'bend,TypeError: f' is undefined
I've lately been playing with TensorFlow JS, where I am a complete beginner, but I may try to train something that could help me classify strings and label them.
I also think that generating the group label is a harder problem than grouping the strings, because sometimes a pair of similar strings have very different lengths and the placeholders are long sentences with special characters like \,".^%#&*!?<>|][{}.
As you have pointed out, it sounds like we can separate this problem into two distinct steps.
Group together similar messages, and
Label each group.
Step 1:
While I am not too familiar with Tensorflow JS, I do not believe it is overkill to use Machine Learning (ML) to tackle this problem, especially for step 1.
In fact, this type of problem is a great candidate for a specific form of ML known as Unsupervised Learning, and more specifically, Clustering. In Unsupervised Learning, we look to find “previously unknown patterns in our data without pre-existing labels”.
See: https://en.wikipedia.org/wiki/Unsupervised_learning
In this context, that means that we do not know if “Error Message 1” and “Error Message 2” will belong to the same group before we apply our Clustering algorithm. Using your example, we can reasonably suspect that the messages:
“Forbidden: User session aad3-1v299-4400 has expired”
“Forbidden: User session jj41-1d333-bbaa has expired”
will belong to the same group, but the Clustering algorithm does not know this when it starts.
We can contrast this with a form of Supervised Learning known as Classification, where we know beforehand that we expect a group to have the form
“Forbidden: User session {{placeholder1}} has expired”.
Then the pre-existing labels in the data are that messages such as
“Forbidden: User session aad3-1v299-4400 has expired”
“Forbidden: User session jj41-1d333-bbaa has expired”
belong to the expected group just above. We essentially give the ML model a bunch of examples of what this group looks like, and then incoming messages that appear to be similar will be classified to this group.
From your description, it sounds like for Step 1 you want to compute a string distance (such as Levenshtein) between all of the example messages, and then apply a Clustering algorithm to those results. Once you have groups (clusters) of messages, Step 2 involves finding an appropriate label for each group.
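As a rough illustration of Step 1, here is a minimal sketch of similarity-based grouping. It swaps in difflib.SequenceMatcher from the Python standard library as a stand-in for Levenshtein distance, and uses a simple greedy grouping against the first message of each group rather than a real clustering algorithm. The function names and the 0.6 threshold are assumptions chosen for this example, not tuned values.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio in [0, 1]; a stand-in for a normalized Levenshtein score."""
    return SequenceMatcher(None, a, b).ratio()


def group_messages(messages, threshold=0.6):
    """Greedy grouping: add each message to the first group whose
    representative (its first message) is similar enough, else start a new group."""
    groups = []
    for msg in messages:
        for group in groups:
            if similarity(msg, group[0]) >= threshold:
                group.append(msg)
                break
        else:
            groups.append([msg])
    return groups


messages = [
    "Forbidden: User session aad3-1v299-4400 has expired.",
    '3312445,Error: Unknown page "retina_list"',
    "Forbidden: User session jj41-1d333-bbaa has expired.",
    '9931234,Error: Unknown page "widget_summary"',
    "Forbidden: User session aab3-bn12n-1111 has expired.",
]

groups = group_messages(messages)
for group in groups:
    print(len(group), group[0])
```

The greedy pass depends on message order and on the threshold. A proper clustering algorithm run on the pairwise distance matrix would be more robust, at the cost of comparing every pair of messages.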
Step 2:
Agreed that finding an appropriate label for each group is likely the harder problem here. One approach that could be useful is to count how many times a word or phrase appears within a group or cluster and, if it does not meet some pre-defined threshold, to replace it with a placeholder as you have in your example label. For example, the words “Forbidden”, “User”, “session”, and “expired” will be common to the group, whereas the alphanumeric IDs listed are unique to the individual messages. If the threshold is that a word or phrase must show up in at least two messages, only the IDs will be replaced by the placeholder.
In this approach, you are essentially looking to find words or phrases that are uncommon to the group, and do not provide useful information in forming an appropriate label. In a way, this is the opposite of a concept used in many search engines that aims to find how common or important a word or phrase is to a document (see https://en.wikipedia.org/wiki/Tf%E2%80%93idf).
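The counting idea for Step 2 can be sketched as follows. This is a minimal sketch that assumes whitespace tokenization and builds the label from the first message of the group. The function name label_group and the min_count parameter are made up for this example.

```python
from collections import Counter


def label_group(messages, min_count=2):
    """Build a label from the first message, replacing tokens that appear in
    fewer than min_count messages of the group with numbered placeholders."""
    # Count in how many messages each token appears (not total occurrences).
    doc_freq = Counter()
    for msg in messages:
        doc_freq.update(set(msg.split()))

    parts = []
    placeholder_no = 0
    previous_was_placeholder = False
    for token in messages[0].split():
        if doc_freq[token] >= min_count:
            parts.append(token)
            previous_was_placeholder = False
        elif not previous_was_placeholder:
            # Collapse a run of rare tokens into a single placeholder.
            placeholder_no += 1
            parts.append("{{placeholder%d}}" % placeholder_no)
            previous_was_placeholder = True
    return " ".join(parts)


group = [
    "Forbidden: User session aad3-1v299-4400 has expired.",
    "Forbidden: User session jj41-1d333-bbaa has expired.",
    "Forbidden: User session aab3-bn12n-1111 has expired.",
]

label = label_group(group)
print(label)  # Forbidden: User session {{placeholder1}} has expired.
```

Whitespace splitting will not isolate dynamic parts that are embedded inside a larger token, such as the numeric suffix in <default-componentfalse2320383>, so a finer tokenizer or a character-level diff would be needed for those messages.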

XGBoost Model in AWS-Sagemaker Fails with no error message

I'm trying to build a model using the XGBoost classifier in AWS SageMaker. I'm following the abalone example, but when I run it to create the training job, it reports InProgress 3 times and then just reports Failed. Where do I go to find out why it failed?
I've double-checked the parameters and made sure the input and output files and directories in S3 are correct. I know there is permission to read and write, because when setting up the data for train/validate/test I read from and write to S3 with no problems.
print(status)
while status != 'Completed' and status != 'Failed':
    time.sleep(60)
    status = client.describe_training_job(TrainingJobName=job_name)['TrainingJobStatus']
    print(status)
That is the code where the print statements come from. Is there something I can add to receive a better error message?
The problem was that the file sent for predictions was CSV, but the XGBoost settings were set to receive libsvm.
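As for where to look when a job fails: the response of describe_training_job includes a FailureReason field for failed jobs, and the container's output normally ends up in CloudWatch Logs. Below is a minimal sketch of the polling loop extended to print that field. It assumes client is a boto3 SageMaker client as in the snippet above. The function name wait_for_training_job and the poll_seconds parameter are made up for this example.

```python
import time


def wait_for_training_job(client, job_name, poll_seconds=60):
    """Poll a training job until it reaches a terminal state and print
    the FailureReason reported by SageMaker if the job failed."""
    while True:
        desc = client.describe_training_job(TrainingJobName=job_name)
        status = desc["TrainingJobStatus"]
        print(status)
        if status in ("Completed", "Failed", "Stopped"):
            break
        time.sleep(poll_seconds)

    if status == "Failed":
        # FailureReason may be absent, so fall back to a neutral message.
        print("FailureReason:", desc.get("FailureReason", "(none reported)"))
    return desc


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   client = boto3.client("sagemaker")
#   wait_for_training_job(client, job_name)
```

If the reason is too terse, the training job's log streams in CloudWatch usually carry the full error from the algorithm container, such as a content-type mismatch like the one described above.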

R2WinBugs data entry incompatible copy error for Conditional Binomial likelihood, probit link, Random Effects (Psoriasis example)

I was working through the guide for calculating the effect size of different treatments using a network meta-analysis, as done in example 6.a.
Here
It works fine in WinBUGS, but I want to do the analysis in R using R2WinBUGS so that I can automate the data input.
It reads the model just fine according to the log, but it gets hung up when reading the data.
E.G.
display(log)
check(C:/Users/Temp User/.../Plaque_Psoriasis_Project/RE_Psoriasis.bug.txt)
model is syntactically correct
data(C:/Users/Temp User/.../Plaque_Psoriasis_Project/data.txt)
This is where the program just hangs.
The trap screen reads
incompatible copy
BugsCmds.TextError [000003A1H]
.beg INTEGER 1699018032
.end INTEGER 34825508
.......
I've tried reading the data as:
data <- list(t=t,C=C,r=r,n=n,na=na,nc=nc, ns=ns, nt=nt, Cmax=Cmax, meanA=meanA, precA=precA)
and as
data <- list("t","C","r","n","na","nc", "ns", "nt", "Cmax", "meanA", "precA")
Neither works.
I got R2WinBUGS to run the toy school example in the documentation, so it can work.
Any thoughts?