Deleted post

Didn't expect your comment to impede my comment-scrubbing efforts. Why oh why are so many people using line breaks in their comments? Don't you know that messes with Excel?

Anyway, random passerby, hi!

Deleted post

I have decided to collect all jam-related comments and calculate the "comment uniqueness" for each participant based on how often they use the same words. Currently I have saved all comments that were made between the beginning of the jam and about Friday in a MySQL database, but I would like to save them in Excel to make all the fancy graphs and show off the data set for anyone interested.
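The post doesn't spell out how "comment uniqueness" is computed, so here is only a rough sketch of one possible reading: the share of distinct words among all words a participant wrote. The function name `comment_uniqueness` and the word-splitting rule are assumptions for illustration, not the actual method.

```python
import re

def comment_uniqueness(comments):
    """Hypothetical metric: distinct words divided by total words
    across all comments of one participant (0.0 if there are no words)."""
    words = []
    for text in comments:
        # \w+ matches Unicode word characters, so non-Latin text is kept too.
        words.extend(re.findall(r"\w+", text.lower()))
    if not words:
        return 0.0
    return len(set(words)) / len(words)
```

For example, `comment_uniqueness(["nice game", "nice art"])` sees four words, three of them distinct, and gives 0.75. Someone who pastes the same sentence everywhere would score much lower.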

You are correct in your assessment. It shouldn't be hard; in fact, there is a standard method that is supposed to replace parts of a string with another string (string.replace("something to replace", "something to replace it with")). Unfortunately, it just doesn't work for me. It detects the new line and adds (in my case) a space character, but does not delete the newline character.
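One possible cause of that symptom, which is only a guess and not confirmed anywhere in this thread, is Windows-style line endings: a line break stored as `\r\n` still leaves one control character behind if only one half of the pair gets replaced. A minimal Python sketch that treats all three common line-break forms as one unit (the name `flatten_comment` is made up for this example):

```python
import re

# Try "\r\n" first so a Windows line break becomes ONE space, not two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")

def flatten_comment(text):
    """Return the comment with every line break replaced by a single space."""
    # Strings are immutable: the result has to be used, the input is unchanged.
    return _LINE_BREAK.sub(" ", text)
```

So `flatten_comment("first line\r\nsecond line")` gives `"first line second line"`. If the stored comments turn out to use some other separator, the pattern would need adjusting.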

Also, I don't want to fix the Excel table by hand, since there are about 200-odd lines and that number will grow when I update my data set.
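An alternative to stripping line breaks, sketched here under the assumption that the export goes through a CSV file, is to let a CSV writer quote every field. A quoted field may contain line breaks and still counts as one cell for readers that follow the usual CSV quoting rules. The function name `rows_to_csv` and the two column names are hypothetical.

```python
import csv
import io

def rows_to_csv(rows):
    """Serialize (author, comment) pairs to CSV text, quoting every field
    so that line breaks inside a comment stay inside one cell."""
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerow(["author", "comment"])  # hypothetical column names
    writer.writerows(rows)
    return buffer.getvalue()
```

Reading the text back with `csv.reader` returns the multi-line comment as a single field. When writing to a real file, opening it with `encoding="utf-8-sig"` is a common way to help Excel recognise UTF-8, though that is worth testing against the actual data.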

Deleted post

You have made a good point. I also think that ignoring comments on your own rating page should be a thing, and luckily it is a very simple query to filter them out. Basically, since I also collected the relation between the game id and the author, I can write something that boils down to: "make a list of all comments by this person that were not on his/her game". That is the magic of databases.
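As a rough illustration of that kind of query: the real data lives in MySQL with a schema that isn't shown here, so this sketch uses Python's built-in `sqlite3` with an invented two-table layout. The table names, column names, and sample rows are all assumptions.

```python
import sqlite3

# In-memory stand-in for the real database; the schema is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (game_id INTEGER PRIMARY KEY, author TEXT);
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        game_id INTEGER,
        commenter TEXT,
        body TEXT
    );
    INSERT INTO games VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO comments VALUES
        (1, 1, 'alice', 'Thanks for playing!'),
        (2, 2, 'alice', 'Great atmosphere.'),
        (3, 1, 'bob', 'Loved the art.');
""")

def comments_on_other_games(conn, person):
    """All comments by `person` that were not left on their own game."""
    rows = conn.execute(
        """
        SELECT c.body
        FROM comments AS c
        JOIN games AS g ON g.game_id = c.game_id
        WHERE c.commenter = ? AND g.author <> ?
        ORDER BY c.comment_id
        """,
        (person, person),
    )
    return [body for (body,) in rows]
```

With the sample rows, `comments_on_other_games(conn, "alice")` skips her reply on her own page and returns only `["Great atmosphere."]`.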

I was working on the filtering problem. The problems I am having might stem from UTF-8 encoded characters, but I am not sure yet and need to do some more testing. In any case, it's probably me doing something incredibly dumb, and it simply takes a while for me to find out what that is.

I would prefer to keep the comment text unchanged and work with the original, since some comments are not written with Latin characters. I can't just throw away whole comments because they are written in a different language. I will probably even drop the spellcheck, since it is slow, limits the language to American English, and doesn't really improve anything.