Tuesday, July 24, 2007

Cimple Project for Community Information Management talk

AnHai is giving his talk now. The project deals with community information management. There are numerous online communities, such as the movie fan community, each with many data sources and many members. Members often want to find out what is new in the community, how members are connected, and what topics are being discussed. The whole idea is to create structured data portals through extraction, integration and collaboration over information sources. Extracting information from the data sources produces entities; connections between the entities are then inferred, which yields a graph.

Building a structured portal semi-automatically, like Citeseer, is not new. Prior work involves collecting a large number of data sources and then applying machine learning techniques. In their approach, they instead choose a small set of data sources and build the portal compositionally and incrementally. To populate the portal, they choose the top 20% of data sources, which account for roughly 80% of what goes on in the community. They created a prototype of their system, called DBLife, for the database research community. The top 20% of data sources become the seed for building the community portal. A plan is then created to generate a daily ER graph: they first find entities and then find relations. How do they find these entities? They first find entity mentions within the data sources, and then match them with similar ones. For example, my name is Alvin Chin, but it could also appear as "Chin Alvin", so the two are matched to the same entity. They also generate variations of names. This technique works well for the majority of cases. Of course, there are cases where it doesn't work, for example Asian names. For those, they apply a stricter matching approach.
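
As a rough illustration of the name-matching idea (my own sketch, not DBLife's actual code), here is how generating name variations and matching mentions to one entity could look in Python. The function names `name_variants` and `match_mentions` and the particular set of variations are my assumptions for the example.

```python
from collections import defaultdict


def name_variants(full_name):
    """Return lowercase surface forms for a 'First Last' style name.

    The choice of variants is an assumption made for illustration.
    """
    parts = full_name.split()
    first, last = parts[0], parts[-1]
    return {
        f"{first} {last}".lower(),     # "alvin chin"
        f"{last} {first}".lower(),     # "chin alvin"
        f"{last}, {first}".lower(),    # "chin, alvin"
        f"{first[0]}. {last}".lower(), # "a. chin"
    }


def match_mentions(known_names, mentions):
    """Group mentions under the known entity whose variant set contains them."""
    index = {}
    for name in known_names:
        for variant in name_variants(name):
            index[variant] = name

    matched = defaultdict(list)
    for mention in mentions:
        # Normalize case and collapse repeated whitespace before lookup.
        key = " ".join(mention.lower().split())
        if key in index:
            matched[index[key]].append(mention)
    return dict(matched)


if __name__ == "__main__":
    print(match_mentions(["Alvin Chin"], ["Chin Alvin", "A. Chin", "Bob Smith"]))
```

A loose scheme like this is exactly what would over-match on short or very common surnames, which is presumably why a stricter matcher is needed for the harder cases.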

The next step is to determine the co-occurrence relations between entities. They also create a plan to find label relations. To handle expansion, they look at the nodes for members within the community and crawl those to expand the tree. They then enlist the users who visit the community portal by allowing them to edit information in a wiki-style format. Right now, they don't incorporate those changes back into the structured database, but that is planned for future research.
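
To make the co-occurrence step a bit more concrete, here is a minimal sketch of my own (not the Cimple implementation): count how often two entities appear on the same page and keep the pairs that show up together often enough. The function name `cooccurrence_edges`, the `min_count` threshold and the page-as-list-of-entities input are all assumptions for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(pages, min_count=2):
    """Return {(entity_a, entity_b): count} for pairs seen together on pages.

    `pages` is an iterable of entity lists, one list per page. Pairs are
    stored in sorted order so (a, b) and (b, a) count as the same edge.
    """
    counts = Counter()
    for entities in pages:
        # Use a set so an entity repeated on one page is counted once.
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    # Keep only pairs that co-occur on at least `min_count` pages.
    return {pair: n for pair, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    pages = [
        ["AnHai Doan", "DBLife", "Cimple"],
        ["AnHai Doan", "DBLife"],
        ["DBLife", "Cimple", "DBLife"],
    ]
    print(cooccurrence_edges(pages))
```

The resulting pairs can be read as weighted edges of a graph over the extracted entities.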

It's interesting that he said the decisions and research his group has made have worked very well. But I suspect this is because they've been able to select the right data sources: the data is clean and the database research community is already well defined on the web, so their technique works well. One of the (many) things they haven't addressed is the capture and extraction of social interactions. This is where my PhD research can help.

All in all, I felt it was a good talk, and shows the potential of research in web communities.


Monday, July 23, 2007

Community information management talk tomorrow

There's a database research seminar talk tomorrow at 2 pm in BA 5256 at U of T on the Cimple Project on Community Information Management, presented by AnHai Doan from the University of Wisconsin-Madison and Yahoo Research. It should be an interesting talk, since they are dealing with the data management of online communities and my PhD research deals with finding community in social hypertext environments.

They have a project web page here.


Friday, July 20, 2007

iPhone is in Canada now!



Yes, I just saw it on sale at the G4Tech kiosk in Oakville Place. I actually touched an iPhone, and it sent shivers down my spine. It was a locked phone, so I couldn't see all the applications like YouTube and the Safari browser. But I will go back next week to see the unlocked iPhone. It's apparently going for $1100! Ouch, nope, not going to get it, but when I see and touch the unlocked iPhone, I will take pics for sure!


StatCounter wins BusinessWeek's Europe's best young entrepreneur

A while ago, Business Week ran a poll to find Europe's best young entrepreneur, and I wrote a blog post calling on people to vote for StatCounter. For those who don't know about StatCounter, it is a free web analytics tool for tracking web sites. I use it for my research group's blog, this blog, and my Windows Live Spaces blog, and I like it.

Well, the votes are in, and Aodhan Cullen, the creator of StatCounter, has won Europe's best young entrepreneur. He started StatCounter when he was 16 and, at 24, he's still going strong. Google Analytics is a competitor to StatCounter, but Cullen doesn't mind and feels that StatCounter is different in some respects from Google Analytics.

Way to go Aodhan and StatCounter!


Thursday, July 19, 2007

Mashups and Twitterlicious

Mashups are the craze these days, and this week was Mashup Camp in Mountain View. What's a mashup? Basically, it's a fusion of different data sources, accessed through public APIs, to create a web application. Mashups were originally started by geeky developers. The first mashups were based on the Google Maps API; for example, there is a mashup that maps TTC stations in Toronto. Just as open-source software lets people tinker with the code and modify it, mashups are in a sense open-web applications. They really open up the web, and the sky is the limit as to what type of applications you can write.

Now, apparently, the big businesses are getting into the game, especially the big three of Google, Yahoo and Microsoft. Google has something called Google Mashups Editor that allows you to create mashups visually in a GUI on the web. It's in beta and right now is limited to a small number of developers. Ever since Google started with Google Maps, they've been speedily bringing desktop-style applications to the web, like Google Docs and Spreadsheets and Gmail. Yahoo also has a mashup tool called Yahoo Pipes; check out my blog entry about that. Of course, it was inevitable that Microsoft would have a mashup tool (they come late into the game with almost every new product, but in the end they do give the competition a run for its money). Their mashup tool is called Microsoft Popfly.

What is Popfly? It's a catchy and cool word. Basically, Popfly is a way for non-developers to easily create web applications without having to write code. It's by invitation only, as it's in Alpha. You have to click a button to join, and then if you're accepted, you'll get an e-mail back that allows you to log in. I did that last week. I haven't really taken the time to test Popfly, being busy with the PhD and finishing the camera-ready version of a conference paper. Hopefully, I'll be able to test drive Popfly soon.

So anyways, back to MashupCamp. Here's a pretty neat mashup: it's called Twitterlicious and, as you might guess, it uses Twitter and del.icio.us. It lets you browse Twitter feeds on your phone, and since it's difficult to follow the URLs in Twitter posts on a phone, you can clip those URLs as del.icio.us bookmarks and view them later on your PC. A video of this from MashupCamp is shown here.


My classmate featured in MIT Technology Review!

I love to read MIT's Technology Review magazine and to see the research and industry articles on the latest technology. My classmate Shengdong Zhao at the Interactive Media Lab has his PhD research featured in MIT Technology Review. His research looks at creating an eyes-free audio interface for mobile devices like the iPod. I know I could use something like this with my iPod when I'm walking to school, especially in winter, when I don't want to take out the iPod to change tracks. I just want to keep the iPod in my pocket and move my finger around the wheel to select the song that I want.


Tuesday, July 17, 2007

Life after PhD

I'm right now finishing up the PhD thesis and will be in the market for a full time job next year either in academia or industry. During my job searching, I came across this interesting graph below that comes from the 2006-7 Taulbee survey:



There is a dramatic increase in the number of new PhDs. And this trend is going to increase into 2007 in the US. What does this mean? There is a huge supply of PhDs looking for work, but unfortunately only a limited number of job openings. Huge supply but low demand. From the statistics, more PhDs are getting into industry than academia. You can also see the salaries of the professors at various US universities. How does that compare to Canada? Here's the 2003-4 stats of Canadian professor salaries as compiled by Statistics Canada.

Interesting trends and information to keep in mind when applying for academic positions for next year.


Thursday, July 12, 2007

Jakob Nielsen against writing blog posts

Taking a break from research, I've been reading up on some of the blogs that I subscribe to from my Bloglines. I came across this blog post from AccordionGuy (Joey DeVilla) who's been a speaker at the CASCON workshops on Business of Blogging and Social Computing: Best Practices, which I chaired. It's about Jakob Nielsen and his case for writing articles, and not blog posts. If you don't know Jakob Nielsen, he's famous for his usability guidelines and his usability heuristics for creating user interfaces.

In this article, it's interesting that Nielsen says that writing articles is more world-class than writing blog posts. Specifically, at the top of his article, he writes:

To demonstrate world-class expertise, avoid quickly written, shallow postings. Instead, invest your time in thorough, value-added content that attracts paying customers.

He feels that blogs defeat that purpose because people rant and just write stuff off the top of their head, so it's not well organized or thought through (like I'm doing right now). Therefore, blogs are not very credible, is what I think he's trying to say. Of course, this brought the king of blogging himself, Robert Scoble, into the conversation with his counterpoint "Jakob Nielsen says “don’t be like Scoble”", which caused a huge discussion in the blogosphere. Robert Scoble sums up Nielsen's advice as follows:

1. Don’t do quick posts like Scoble.
2. Don’t risk being an idiot like Scoble.
3. Don’t put comments on your idiocy like Scoble.
4. Don’t link to other idiots like Scoble.
5. If you want to seem like you know something, unlike Scoble, write long ass white papers with lots of charts.
6. Don’t have fun like that idiot Scoble.
7. Don’t you dare put pictures of cats or babies or other personal details up like Scoble does.
8. Don’t add Web 2.0 mechanisms to your Web site like Scoble does. Definitely no “del.icio.us” or “Digg” voting graphics.
9. Don’t get caught dead inside an Apple store like Scoble does.
10. Don’t give Fake Steve or Valleywag a reason to deride you like Scoble does.
11. Definitely don’t get close to Twitter/Jaiku/Pownce/Facebook like Scoble does. If you can say it in 140 characters you shouldn’t say it at all.


Of course, that is Scoble's interpretation. There are some who agreed with him, saying that Nielsen doesn't really understand what blogging is and the impact that blogging can have (there are many examples of this and of how credible a source blogging can be compared with traditional media). For more information, you can read Scoble and Israel's book Naked Conversations, which I highly recommend. I've posted my own thoughts on the book if you're interested. Then there are others who felt that Scoble is being egotistical, especially in saying that Nielsen says do not be like Scoble and that Nielsen was making an attack on Scoble. In his article, Nielsen does not even mention Scoble. However, I think what Scoble is trying to say is that Nielsen is against the practices that Scoble follows in blogging, not against Scoble himself. I don't think Scoble intended to say that Nielsen was attacking him personally, just, indirectly, his behaviour and his beliefs.

Of course, this controversy has drawn 88 comments on Scoble's blog post, so there are lots of interesting conversations happening. In my own opinion, blogging and writing long, thought-out articles both have a place on the web, and each serves its own purpose. There may be times, for example, when you want to write something for an audience in a professional manner, to make it scholarly and intelligent. Then there may be times when you just want to start a conversation and share your own thoughts (like I'm doing here), and it's not meant to be something like a news article. Some people don't have the time to read long articles (I for sure don't, with my busy day!) and enjoy just the short blog entries (sorry, this blog post I'm writing is not short!). I think one thing that Nielsen misses is the impact that blogging can have through its linking behaviour. It's powerful, it gets you noticed, it's self-publishing and self-advertisement. Look at Technorati and del.icio.us, where you can find conversations and trends on various topics, and search through blogs. It's so powerful that I'm studying blogging to find communities from the social networks that get created through the blog links.


Monday, July 09, 2007

Twittering from the phone

This Twitter thing is really getting pretty hot. If you don't know what Twitter is, read my post here. Now there is an interface called Twittergram that takes an audio clip and posts it to Twitter. It's kind of a voice version of Twitter, according to this Webware article: you submit your Twitter account info and an MP3 file, and Twittergram creates a Twitter post with a link to the MP3 file. Twittergram was created by Dave Winer, a pioneer of RSS and blogging whose Scripting News is considered one of the first blogs.

Here are some comments on Twittergram from Dave's blog. I guess some people really have too much time on their hands and want to update everything that they do at certain times during the day. There's even now a phone interface where you can send the audio clip if you want to Twitter on the go. I guess this is kind of like a mobile mini podcast. It would be interesting to see if there is some type of community that exists within the Twitter posts.


Monday, July 02, 2007

Under the hood of the Apple iPhone

It was just a matter of minutes before the first pics of the inside of the Apple iPhone were revealed. These are really the true engineers and geeks! Of course, by doing so, they voided the Apple warranty, and some even made the iPhone unusable. What a waste of $500 US to void the Apple warranty.

Anyways, I found pics of the inside of the Apple iPhone on ThinkSecret, going from the box, to opening the box and revealing the accessories, to taking apart the iPhone. They reveal the secret components that make the Apple iPhone work. Interesting. Also, Steve Jobs announced that all Apple full-time employees get an 8GB iPhone for free! Man, that makes me wish I worked at Apple!!!


Camera ready of Hypertext paper submitted and CASCON short paper accepted!

I just submitted the camera-ready version of my short paper for the Hypertext conference, called Identifying Subcommunities Using Cohesive Subgroups in Social Hypertext. It was a challenge to take the originally submitted 8-page full paper and cut it down to 4 pages, but I believe it's more focussed. The Hypertext conference is in Manchester, UK and runs from September 10 to 12. I will post the paper right after the conference.

On another note, I also got another paper accepted to the CASCON conference in October in Toronto. It's also a short paper and I'm right now working on the camera ready version. So look for both papers towards September, as well as podcasts and slides from my talks. They will be nice additions of case studies to my PhD thesis.


Happy Canada Day!



Today is Canada's 140th birthday, since Canada became a nation on July 1, 1867. Happy birthday Canada! From the Great Canadian Wish List that CBC ran on Facebook, here are the top 30 wishes voted by Facebook members. The top wish is to abolish abortion.

I hope everyone had a safe and happy Canada Day! We are truly blessed to live in such a diverse, multicultural, and free country that is Canada. Bonne fete Canada, vive le Canada!

Happy birthday Canada!
