Sohu Kan

  Sohu Kan was abandoned in 2013. The service kept running until mid-2015.

  The original images were not kept. Most of the ones here were collected from the internet,
  so some features have no images.

Introduction

Sohu Kan is a service with a set of apps for collecting and consuming web content such as articles, novels and images, making use of fragmented time.

It supports iOS, Android, Web, Chrome (plugin) and Firefox (plugin). By simply clicking “share to Kan” in the apps/plugins (for Web, a URL widget is provided), users can quickly add web content to their Kan collection. Kan then crawls the page, extracts the main content and pushes it to the user's devices.

Users read the articles while riding the bus, before going to sleep or even while having lunch. The content is extracted and reorganized into a clean format for a good reading experience.

During that time, I was in charge of and led the technology teams for the whole product, including architecture, design, development, testing, operations and so on.

Features

The functionality can basically be abstracted into:

  • Add the link of web content to Kan (via different methods)
  • Kan crawls the content and cleans it up for reading.
  • Kan pushes the content back to all your devices.

Images of apps:

But you know, we did many things to make it:

  • easy to use
  • fast to collect & push
  • pleasant to read
  • convenient to manage
  • cheap on your internet traffic fees
  • light on memory space usage
  • available on the App Store, Google Play and 20~30 other local Android app stores with each release.

So users get the following features (see note 1):

Images of the feature introduction:

  • System integrated “Share to Kan” menu/button to share any content.
  • Extract main content (removes noise text, ads, etc.)
    • articles (news, blogs, etc.)
    • images (comics, galleries, etc.)
    • novels (chapters are downloaded automatically on some supported websites)
    • BBS forums (some are supported)
    • content that other tools cannot download.
  • Real-time push and notifications
  • Fast synchronization across platforms/devices
  • Read offline
  • Good experience for reading (optimized format, font, size, etc.)
  • Customizable reading view (dark, light, size, background, width, font, etc.)
  • Integration with some websites to share directly from the web page
  • Add URL from clipboard
  • Category & tag management
  • Image view
  • Share to friends
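
To give a feel for what "extract main content" in the list above involves, here is a minimal sketch in Python. It is not the extractor Kan actually used; it only assumes a simple heuristic: keep paragraphs that are long enough and not dominated by link text, and ignore anything inside typical boilerplate containers. The names MainContentExtractor and extract_main_content, the tag list and the thresholds are all hypothetical.

```python
from html.parser import HTMLParser

# Containers that usually hold navigation, ads or scripts (an assumption).
SKIP_TAGS = {"script", "style", "nav", "aside", "footer", "header"}


class MainContentExtractor(HTMLParser):
    """Collects <p> text, dropping boilerplate containers and link-heavy paragraphs."""

    def __init__(self, min_chars=40, max_link_ratio=0.5):
        super().__init__()
        self.min_chars = min_chars
        self.max_link_ratio = max_link_ratio
        self.paragraphs = []
        self._skip_depth = 0   # > 0 while inside a boilerplate container
        self._in_p = False
        self._in_a = False
        self._text = []
        self._link_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "p" and self._skip_depth == 0:
            self._in_p = True
            self._text = []
            self._link_chars = 0
        elif tag == "a":
            self._in_a = True

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "a":
            self._in_a = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            text = " ".join("".join(self._text).split())  # normalize whitespace
            if (text and len(text) >= self.min_chars
                    and self._link_chars / len(text) <= self.max_link_ratio):
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_p and self._skip_depth == 0:
            self._text.append(data)
            if self._in_a:
                self._link_chars += len(data.strip())


def extract_main_content(html, **options):
    """Return the kept paragraphs joined by blank lines."""
    parser = MainContentExtractor(**options)
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.paragraphs)
```

Calling extract_main_content(page_html) returns only the paragraphs that pass both checks. A real extractor needs much more (site-specific rules, images, pagination for novels), which is why some content types above are marked as supported only on some websites.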

We also provided a guide for users.

Inside

Engineering

We maintained 2 releases (old and new) and uploaded to 20~30 app stores each week.

  • One release per week
  • 2 versions in parallel development/releasing
  • 4 client apps (iOS, Android, web, plugin for Chrome/Firefox)
  • 1 API service cluster for responses, management, notifications, push, etc.
  • 1 backend service cluster for crawling, text extraction and purification.
  • 1 MQ cluster for distributing tasks to workers
  • 1 DB cluster for article storage
  • 1 Object Storage service for images (simulated at first, later integrated with Sohu Cloud)
  • 2 parallel version tests
  • Operations on servers, releases
  • git for release management
  • Admin console for monitoring, alerts, statistics, etc.
  • Quick releases for internet events, holidays, etc. (different landing pages)
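
The "MQ cluster for distributing tasks to workers" item in the list above follows the usual work-queue pattern. The sketch below illustrates that pattern only: it uses Python's in-process queue.Queue and threads as a stand-in for the real message queue, and the function name run_workers and the handler argument are hypothetical.

```python
import queue
import threading


def run_workers(urls, handler, num_workers=3):
    """Distribute URLs to worker threads through a shared queue.

    Returns a dict mapping each URL to the handler's result, or to the
    exception it raised.
    """
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            url = tasks.get()
            if url is None:              # sentinel: no more work for this worker
                tasks.task_done()
                return
            try:
                outcome = handler(url)
            except Exception as exc:     # a failed task must not kill the worker
                outcome = exc
            with lock:
                results[url] = outcome
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in urls:
        tasks.put(url)
    for _ in threads:
        tasks.put(None)                  # one sentinel per worker
    tasks.join()
    for t in threads:
        t.join()
    return results
```

In this sketch the handler would be the crawl-and-extract step. With a real message broker the queue lives in a separate service, so workers can run on other machines and be added or removed without touching the API layer.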

I set up and ran the process with 2 dev teams (8 developers, plus 2 interns later), 1 QA team (3 QAs) and 1 Ops engineer to achieve all of this successfully and smoothly.

Besides, we used F5, floating IPs and domains, and Object Storage on Sohu Cloud.

Images of App Stores (see note 2):

Technical

Architecture

The Kan backend service is designed to support the client apps, with the flexibility to publish the API for third-party apps.

The web service, which is the interactive layer, is designed as a REST API. It sits behind the F5 hardware load balancer. The IP is floating and parked on the Nginx cluster: if one Nginx node goes down, the IP automatically moves to another. The API cluster sits behind Nginx with a round-robin (or session hash) policy.
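
The difference between the two policies mentioned above can be sketched in a few lines of Python. This is an illustration of the ideas only, not how Nginx implements them; the class names and the server names in the usage note are hypothetical.

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring who is asking."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def pick(self, session_id=None):
        return next(self._rotation)


class SessionHashBalancer:
    """Maps each session ID to the same server as long as the server list is unchanged."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")

    def pick(self, session_id):
        # Use a stable hash: Python's built-in hash() is salted per process.
        digest = hashlib.sha256(session_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._servers)
        return self._servers[index]
```

With servers ["api-1", "api-2", "api-3"], the round-robin balancer returns api-1, api-2, api-3, api-1 and so on, while the session-hash balancer keeps returning the same server for a given session ID. Round-robin spreads load evenly when the API nodes are stateless; session hashing helps when a node keeps per-session state or cache.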

RabbitMQ and MongoDB

REST API and Application service

to be written

Backend service

to be written

Clients

to be written

Operations

to be written

Operation

to be written

Design

to be written

Marketing

to be written

What if we could continue

Statistics

Images of first time statistics publication

Recommendation

to be written

Publish the API

to be written

Reminder

to be written

Cluster analysis & Aggregation analysis

to be written

More

..

Media Reports

  Most media reports are in Chinese.

We invited many media outlets to try Kan and report on it. The following are some of the links that I could find now.


  1. The feature descriptions are recalled from memory. There are many features I can no longer remember well enough to write down.
  2. Images could not be found for the Google Play and Firefox app/plugin stores.
Samuel Chen
developer, architect

My interests include cloud as a software, distributed web system, mobile computing and programmable matter.