Rails 3.1: UrlEncode issues with periods (.) - ruby-on-rails-3

I have a site where when a user searches for an artist, song or album and click search, the search results are displayed. The individual search terms are then set to be clickable, meaning each use their own paths (or routes) to generate links.
The issue I keep running into is with random weird characters showing up in some of the artists, songs or album names (such as periods (.)). Is there anyway to url encode these?
Here is my current code:
<% artists[0..5].each do |art| %>
<li><span><strong><%= link_to "#{art}", artist_path(CGI::escape(art)) %></strong></span></li>
<% end %>

Assume you have an album name called "slash#^[]/=hi?qqq=123"
encoded = URI.escape str_to_be_encoded
encoded = URI.escape(str_to_be_encoded, Regexp.new("[^#{URI::PATTERN::UNRESERVED.gsub('.','')}]"))
The first one would encode to
"slash#%5E[]/=hi?qqq=123"
The second would encode to
"slash%40%5E%5B%5D%2F%3Dhi%3Fqqq%3D123"
What happens is that most url encoding methods would not escape characters that it thinks it part of a value url, so symbols like equal and question mark are not escaped.
The second method tells the escape function to also escape url-legal characters. So you get a better encoded string.
You can then append it to your url like
"http://localhost:3000/albums/1-#{encoded}"
Hope this helps.

Related

Display markdown safely as HTML in Vue3

So I have a set of strings, with some "custom markdown" that I have created. My intention is to render these strings as HTML in the frontend. Let's say, I have this string:
This is a string <color>that I need</color> to\nrender <caution>safely in the browser</caution>. This is some trailing text
I would be expecting to get something like:
This is a string <span class="primaryColor">that I need</span> to<br>render <div class="caution">safely in the browser</div>. This is some trailing text
And the way I do it right now is with some basic Regex:
toHtml = text
.replace(/<color>(.*)<\/color>/gim, "<span class='primaryColor'>$1</span>")
.replace(/\\n/g, "<br>")
.replace(/<caution>(.*)<\/caution>/gims, "<div class='caution'>$1</div>")
This works fine and returns the correct string. And then for printing, in the template I just:
<div id="container" v-html="result"></div>
My problem is that at some point I expect users to be able to enter this strings themselves, and that would be displayed to other users too. So for sure, I am gonna be vulnerable to XSS attacks.
Is there any alternative I can use to avoid this? I have been looking at https://github.com/Vannsl/vue-3-sanitize which looks like a good way of just allowing the div, span and br tags that I am using, and set the allowed attributes to be only class for all the tags. Would this be safe enough? Is there something else I should do?
In that case, I believe it will not be necessary to sanitize it in the backend too, right? Meaning, there will be no way for the web browser to execut malicious code, even if the string in the server contains <script>malicious code</script>, right?
My problem is that at some point I expect users to be able to enter this strings themselves
So, Do we have a form input for the users to enter the string which you mentioned in the post ? If Yes, My suggestion is that you can sanitize the user input at first place before passing to the backend. So that in backend itself no malicious code should be stored.
Hence, By using string.replace() method. You can first replace the malicious tags for ex. <script>, <a, etc. from the input string and then store that in a database.
Steps you can follow :
Create a blacklist variable which will contain the regex of non-allowed characters/strings.
By using string.replace(), replace all the occurrence of the characters available in the string as per the blacklist regex with the empty string.
Store the sanitized string in database.
So that, You will not get worried about the string coming from backend and you can bind that via v-html without any harm.

HTML not rendering through EJS

so basically I have a bunch of HTML strings in a MySQL table and I am trying to display then through EJS.
For instance, I have a string that looks like this is a link with some <code>code</code> next to it. In my code I try to display it in that way.
<%- listOfStrings["myString"] -%>
However, as you probably guessed when reading the title, the string seems to be escaped when displaying on the screen.
What's even weirder to me is that I have two tables with such strings, and it works for the first one, while it doesn't for the second one. One difference though, is that the first one is hardcoded, while the second one can be edited through some tool on my website. Encoding is utf32_unicode_ci for both tables, if that matters.
For debugging purposes I tried to store the aforementioned strings in a js variable and display them in the console: then it seems like <and > characters are all escaped for some reason. Is there an explanation to this behavior, and if so how to fix it so that HTML renders correctly?
Thanks for your help!
You can try it :
<%=listOfStrings["myString"]%>

Formatting [] array like params Rails

here my view code:
- #categories.each do |c|
%li.menu-drop
= check_box_tag "categories[]", c.id
.tipo-infor
= c.title
when i submit this form, it generate this url:
http://0.0.0.0:3000/buscar-projetos-com?utf8=%E2%9C%93&categories%5B%5D=3&categories%5B%5D=4&commit=Pesquisar
my question is how to better display this url just like:
http://0.0.0.0:3000/buscar-projetos-com?categories=2,3,4,5
Those %E2's and such are the ASCII representation of those URLs and it is much safer to use those when sending data over the internet. Furthermore commas (,) are not legal in URLs, so they should never be used as neither browsers nor standard libraries will like them.
If you are concerned about "ugly" URL strings, you could submit your form using the POST method vs GET, which will hide all params from the user.

How can I use a permalink containing a slash in a named route in Rails 3?

I followed Ryan Bates screencast of how to use permalinks in a Rails application. Unfortunately I am stuck with an issue when some of my permalinks contain slashes. Is there anything that I can do in the controller to encode those on the fly, or do they need to be encoded in the database?
You can use Rack::Utils.escape to return a clean, friendly URI. For instance:
Rack::Utils.escape("This/is/not/a/good/url")
will return
"This%2Fis%2Fnot%2Fa%2Fgood%2Furl"
and
Rack::Utils.unescape("This%2Fis%2Fnot%2Fa%2Fgood%2Furl")
converts it back to the original string:
"This%2Fis%2Fnot%2Fa%2Fgood%2Furl"
You'll have to wire those methods into the find methods in the controller, but should work out for you.
To generate permalinks that are safe, use something like this. It will create a 4 character long, url safe permalink and check to make sure there are no duplicates.
def create_permalink
loop do
self.permalink = SecureRandom.urlsafe_base64(4).downcase
break permalink unless ModelName.find_by_permalink(permalink)
end
end

What is the maximum number of url parameters that can be added to the exclusion list for google analytics

I set up a profile for Google Analytics. I have several dozen url parameters that various pages use and I want to exclude. Luckily, google has a field you can modify under the general profile settings [Exclude URL Query Parameters:]. Of the several dozen items I have they are all working, and not being considered part of the URL. Except for the parameter propid
I added propid to the comma separated list on Monday. But, everyday when I check GA, sure enough they are coming through with that parameter still attached.
So, am I trying to exclude too many parameters? I couldn't find any documentation on GA's site to say there was a limit.
here is the exact content of the exclude URL Query parameter field
There reason there are so many is the bh before me didn't know the difference between get/post.
propid,account,pp,kw1,kw2,kw3,sortby,page,msg,sd,ed,ea,ec,sc,subname,subcode,sa,qc,type,code,propid,acct,minbr,maxbr,minfb,maxfb,minhb,maxhb,minrm,maxrm,minst,maxst,minun,maxun,minyb,maxyb,minla,maxla,minba,maxba,minuc,maxuc,card,print,year,type
update
I thought after more time had passed the "bad data" would fall of of GA. But as of yesterday it is still reporting on the propid querystring value despite adding that as well as other variables to the exclude list.
update2
I found this post on google https://www.google.com/support/forum/p/Google+Analytics/thread?tid=72de4afc7b734c4e&hl=en
It reads that the field only allows 255 char, Ok. Problem Solved. Except my field of values is only 247 charcters.. ARGGGHH!
*Update 3 *
So Here is the code I've added to the googleAnalytics.asp include page that goes at the top of everyone of my asp classic pages. Can anyone see a flaw in the design? I don't care about ANY query string info. (it could have been named *.inc, but I like having intellisense working)
<script type="text/javascript">
<% GAPageDisplayName = REQUEST.ServerVariables("PATH_INFO") %>
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-20842347-1']);
_gaq.push(['_setDomainName', '.sc-pa.com']);
<% if GAPageDisplayName <> "" then %>
_gaq.push(['_trackPageview','<%=GAPageDisplayName %>']);
<% else %>
_gaq.push(['_trackPageview']);
<% end if %>
(function () {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
Update 4
I'll only accept an answer if you will include something talking to the original question. My question was very specific, I wanted to know exactly the number of characters google allows. Everything I included in my original question body was simply to backfill the question to put everything in context.
Might I suggest an alternate solution to the reliance on manually excluding all of these (and feasibly any string ever used)?
I'd suggest passing a parameter to the trackPageView function to 'force' the recording of a manually/programatically set 'page name' value.
Whereas by default, GA records/defines a page based on a unique URL, the inclusion of a pagename parameter would associate all pageviews of a page with that parameter as pageviews to a single page.
For example, standard GA pageview code looks like this: _gaq.push(['_trackPageview']);, whereas the inclusion of a specific page name looks like this: _gaq.push(['_trackPageview', 'Homepage']);. With the latter, presuming that the homepage is at www.site.com, regardless of how that page is accessed GA will always consolidate all pageview stats for it as 'Homepage'. So, www.site.com/index.php, www.site.com/?a=b and www.site.com/?1=2&x=y will always report as 'Homepage' as if it was one page.
The only drawback here is that you need to be incredibly careful around any occurences of pagination, nested pages, content swapping, site search, or any functionality which may in fact rely on the use of query strings; you may need to consider some logic on how the page name values are output, rather than attempting to define on a per-page basis depending on the site of your site(s).
Hope that's helpful!
Do you realize that you have propid listed twice in the exclusion field? Once at the beginning and then again about one-third of the way through. That's the only thing that stands out to me. See what happens if you remove either of these.
You also have type duplicated, so if the above fixes the problem for propid, also consider removing the second type.
Google limits the characters in the "Exclude Url Query" field (2048 characters max), not the number of queries. I had the same issue you're having and what I discovered was that I had populated my query string parameter list based on the pagenames in my pages report. Well those pagenames first pass through a view-level lowercase filter that I have set up. And since the "Exclude URL Query" field is case sensitive, some of the parameters were getting through. Hopefully this helps.