04/2017 Uber senior software engineer offer: 435K annually, pre-IPO

Uber Level: SDE2
Education: College
Location: San Francisco

Experience: >5 years
Competing Offer: NA
Base: 170K
Bonus: at least 40K (performance based)
Equity (over 4 years): 18K shares (about $50 per share)
Sign on bonus: NA
Expected income: 435K/yr

Analysis: another great offer for someone with only a college degree. Why did we get postgraduate degrees anyway?

04/2017 Uber fresh PhD offer, SDE2: about 240K per year

Uber Level: SDE2
Education: PhD
Location: San Francisco

Experience: Fresh PhD
Competing Offer: Microsoft
Base: 135K
Bonus: 40K (performance based)
Equity (over 4 years): 250K worth
Sign on bonus: NA
Expected income: 237K/yr

Analysis: a significantly smaller package than 2016’s fresh PhD offer
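
The expected-income figures in the two offers above appear to follow from base + bonus + one quarter of the four-year equity grant. A minimal Java sketch of that arithmetic, assuming equity vests evenly over four years and the bonus pays out at the quoted amount (the class and method names are made up for illustration):

```java
// Hypothetical helper, not from the original posts: annualizes an offer as
// base + bonus + (four-year equity grant / 4), assuming even vesting.
class OfferMath {
    static double expectedAnnualIncome(double base, double bonus, double equityOverFourYears) {
        return base + bonus + equityOverFourYears / 4.0;
    }

    public static void main(String[] args) {
        // Senior offer: 18K shares at about $50 per share.
        double seniorEquity = 18_000 * 50.0;
        System.out.println(expectedAnnualIncome(170_000, 40_000, seniorEquity));
        // Fresh PhD offer: equity quoted as 250K worth.
        System.out.println(expectedAnnualIncome(135_000, 40_000, 250_000));
    }
}
```

Under these assumptions the first offer works out to 435K and the second to 237.5K, in line with the 435K and roughly 237K quoted in the posts.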

Design question: News Feeds

Question

To keep it simple, let’s focus on designing a news feed system for Facebook, since different products have different requirements.

To briefly summarize the feature: when users go to their home pages, they see updates from their friends in a particular order. Feeds can contain images, videos or just text, and a user can have a large number of friends.

So how would you design such a news feed system from scratch?

Subproblems

If you haven’t thought about this problem before, it’s better to try solving it yourself before reading the rest of the post. Although there’s no such thing as a standard answer, you can still learn a lot by comparing your solution with others’.

Here we go. As we said before, when facing such a large and vague system design question, it’s better to start with some high-level ideas by dividing the big problem into subproblems.

A news feed system naturally divides into front end and back end. I’ll skip the front end as it’s not that common in system design interviews. For the back end, three subproblems seem critical to me:

  • Data model. We need a schema to store user and feed objects. More importantly, there are lots of trade-offs when we optimize the system for reads or writes. I’ll explain in detail next.
  • Feed ranking. Facebook does more than rank chronologically.
  • Feed publishing. Publishing can be trivial when there are only a few hundred users, but it can be costly when there are millions or even billions of users. So there’s a scale problem here.

Data model

There are two basic objects: user and feed. For the user object, we can store userID, name, registration date and so on. For the feed object, there are feedId, feedType, content, metadata etc., which should support images and videos as well.

If we are using a relational database, we also need to model two relations: the user-feed relation and the friend relation. The former is pretty straightforward. We can create a user-feed table that stores a userID and the corresponding feedID. A single user can have multiple entries if they have published many feeds.

For the friend relation, an adjacency list is one of the most common approaches. If we see all the users as nodes in a giant graph, the edges that connect nodes denote friend relations. We can use a friend table that contains two userIDs in each entry to model an edge (a friend relation). With this, most operations are quite convenient, such as fetching all friends of a user or checking whether two people are friends.
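
The objects and relations described in this section can be sketched with in-memory maps standing in for the tables. This is only an illustration: the class, field and method names below are assumptions, and a real system would keep this data in a database rather than in Java collections.

```java
import java.util.*;

// Illustrative in-memory stand-ins for the tables described above.
// All names here are assumptions made for this sketch.
class FeedDataModel {
    static class User {
        final long userId;
        final String name;
        User(long userId, String name) { this.userId = userId; this.name = name; }
    }

    static class Feed {
        final long feedId;
        final String feedType;   // e.g. "text", "image", "video"
        final String content;
        Feed(long feedId, String feedType, String content) {
            this.feedId = feedId; this.feedType = feedType; this.content = content;
        }
    }

    final Map<Long, User> users = new HashMap<>();
    final Map<Long, Feed> feeds = new HashMap<>();
    // user-feed relation: userID -> feedIDs published by that user
    final Map<Long, List<Long>> userFeeds = new HashMap<>();
    // friend relation: each friendship (edge) is stored in both directions
    final Map<Long, Set<Long>> friends = new HashMap<>();

    void addUser(User u) { users.put(u.userId, u); }

    void publish(long userId, Feed f) {
        feeds.put(f.feedId, f);
        userFeeds.computeIfAbsent(userId, k -> new ArrayList<>()).add(f.feedId);
    }

    void addFriendship(long a, long b) {
        friends.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        friends.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // Fetch all friends of a user.
    Set<Long> friendsOf(long userId) {
        return friends.getOrDefault(userId, Collections.emptySet());
    }

    // Check whether two people are friends.
    boolean areFriends(long a, long b) {
        return friendsOf(a).contains(b);
    }
}
```

Storing each edge in both directions doubles the friend data but keeps both lookups to a single map access, which is the same kind of read/write trade-off the rest of this section discusses.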

Data model (continued)

In the design above, let’s see what happens when we fetch feeds from all friends of a user.

The system will first get the userIDs of all friends from the friend table. Then it fetches all feedIDs for each friend from the user-feed table. Finally, feed content is fetched by feedID from the feed table. You can see that the query touches three tables (two joins in SQL terms), which can affect performance.

A common optimization is to store the feed content together with the feedID in the user-feed table so that we don’t need to join the feed table any more. This approach is called denormalization: by adding redundant data, we optimize read performance (by reducing the number of joins).

The disadvantages are obvious:

  • Data redundancy. We are storing redundant data, which occupies storage space (classic time-space trade-off).
  • Data consistency. Whenever we update a feed, we need to update both the feed table and the user-feed table. Otherwise the two copies become inconsistent. This increases the complexity of the system.

Remember that neither approach (normalization vs. denormalization) is always better than the other. It’s a matter of whether you want to optimize for reads or writes.
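
To make the trade-off concrete, here is a small Java sketch of the two read paths described in this section. The maps are hypothetical stand-ins for the friend, user-feed and feed tables, and all names are illustrative assumptions.

```java
import java.util.*;

// Sketch of the normalized and denormalized read paths.
// The in-memory "tables" and all names are assumptions for illustration.
class FeedReadPaths {
    final Map<Long, List<Long>> friendTable = new HashMap<>();    // userID -> friend userIDs
    final Map<Long, List<Long>> userFeedTable = new HashMap<>();  // userID -> feedIDs
    final Map<Long, String> feedTable = new HashMap<>();          // feedID -> content
    // Denormalized copy: feed content stored next to the user-feed relation.
    final Map<Long, List<String>> userFeedWithContent = new HashMap<>();

    void publish(long userId, long feedId, String content) {
        feedTable.put(feedId, content);
        userFeedTable.computeIfAbsent(userId, k -> new ArrayList<>()).add(feedId);
        // The extra write that denormalization requires on every publish.
        userFeedWithContent.computeIfAbsent(userId, k -> new ArrayList<>()).add(content);
    }

    // Normalized read: friend table -> user-feed table -> feed table.
    List<String> fetchNormalized(long userId) {
        List<String> result = new ArrayList<>();
        for (long friend : friendTable.getOrDefault(userId, Collections.emptyList())) {
            for (long feedId : userFeedTable.getOrDefault(friend, Collections.emptyList())) {
                result.add(feedTable.get(feedId));
            }
        }
        return result;
    }

    // Denormalized read: friend table -> user-feed table (content is already there).
    List<String> fetchDenormalized(long userId) {
        List<String> result = new ArrayList<>();
        for (long friend : friendTable.getOrDefault(userId, Collections.emptyList())) {
            result.addAll(userFeedWithContent.getOrDefault(friend, Collections.emptyList()));
        }
        return result;
    }
}
```

The denormalized read skips one lookup per feed, while publish (and any later edit to a feed) has to write the content twice, which is where the redundancy and consistency costs come from.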

Ranking

The most straightforward way to rank feeds is by creation time. Obviously, Facebook does more than that: “important” feeds are ranked on top.

Before jumping to the ranking algorithm, I usually like to ask: why do we want to change the ranking, and how do we evaluate whether the new ranking algorithm is better? It’s definitely impressive if candidates come up with these questions by themselves.

The reason to improve ranking is not that it seems like the right thing to do; everything should happen for a reason. Let’s say there are several core metrics we care about, e.g. user stickiness, retention, ads revenue etc. A better ranking system can potentially improve these metrics significantly, which also answers how to evaluate whether we are making progress.

So back to the question: how should we rank feeds? A common strategy, used for ranking problems in general, is to calculate a score for each feed from various features and rank feeds by that score.

More specifically, we can select several features that are most relevant to the importance of a feed, e.g. the number of shares/likes/comments, the time of the update, whether the feed has images/videos etc. A score can then be computed from these features, for example as a linear combination. This is usually enough for a naive ranking system.
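
A linear-combination score of this kind can be sketched in a few lines of Java. The weights and the time-decay term below are made-up illustrative values, not anything Facebook is known to use, and the class and field names are assumptions.

```java
import java.util.*;

// Naive linear feed score. The weights and the age penalty are illustrative
// assumptions for this sketch, not values from the post.
class FeedRanker {
    static class FeedStats {
        final long feedId;
        final int likes, comments, shares;
        final double ageHours;
        final boolean hasMedia;
        FeedStats(long feedId, int likes, int comments, int shares, double ageHours, boolean hasMedia) {
            this.feedId = feedId; this.likes = likes; this.comments = comments;
            this.shares = shares; this.ageHours = ageHours; this.hasMedia = hasMedia;
        }
    }

    // score = weighted engagement + media bonus - age penalty
    static double score(FeedStats f) {
        return 1.0 * f.likes
             + 2.0 * f.comments
             + 3.0 * f.shares
             + (f.hasMedia ? 5.0 : 0.0)
             - 0.5 * f.ageHours;
    }

    // Returns a new list sorted by descending score; the input is not modified.
    static List<FeedStats> rank(List<FeedStats> feeds) {
        List<FeedStats> sorted = new ArrayList<>(feeds);
        sorted.sort((a, b) -> Double.compare(score(b), score(a)));
        return sorted;
    }
}
```

In practice the weights would be tuned against the core metrics mentioned above (for example through A/B tests) rather than picked by hand.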

Design question: POI

FB and Uber like this one.

A point of interest, or POI, is a specific point location that someone may find useful or interesting.

Q1. Given the current location, how do you find the k closest points?
Q2. Given the current location, how do you find all the points within k miles?

A1. Geohash
A2. K-D tree
A3. Space-filling Curve 
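
As an illustration of A1, here is a minimal Java sketch of standard geohash encoding: longitude and latitude are bisected alternately (longitude first) and every five bits become one base-32 character. The class and method names are assumptions for this sketch.

```java
// Standard geohash encoding: interleave longitude/latitude bisection bits
// (longitude first) and emit one base-32 character per 5 bits.
class Geohash {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lon, int precision) {
        double latLo = -90, latHi = 90, lonLo = -180, lonHi = 180;
        StringBuilder sb = new StringBuilder();
        boolean evenBit = true;  // even bits refine longitude, odd bits latitude
        int bits = 0, value = 0;
        while (sb.length() < precision) {
            if (evenBit) {
                double mid = (lonLo + lonHi) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonLo = mid; }
                else { value <<= 1; lonHi = mid; }
            } else {
                double mid = (latLo + latHi) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latLo = mid; }
                else { value <<= 1; latHi = mid; }
            }
            evenBit = !evenBit;
            if (++bits == 5) {           // 5 bits collected: emit one character
                sb.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return sb.toString();
    }
}
```

Points in the same cell share a prefix, so a search for Q1 or Q2 can start from the POIs whose geohash matches the prefix of the current location. Two points can be close while sitting in different cells, so the neighboring cells usually need to be checked as well, followed by an exact distance filter.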

Intersection of intervals: search for points in a range.
Intersection of rectangles (using a BST): search for overlapping intervals.
https://www.youtube.com/watch?v=Igr6yONkpIQ

Humbled again: offer packages for jumping from Facebook to other big companies (Uber, Google, Snapchat, Dropbox)

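Comparing yourself with others is maddening. Enough said; straight to the numbers.
Background: CS Master’s, 5.5 years of experience

Location: Bay Area

Currently at Facebook

I recently received several offers; please take a look and give me some advice.
The approximate annual packages are as follows: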

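Google T5: 350k
(For a Master’s plus 5.5 years, I’m quite satisfied with getting G T5. I know quite a few friends who couldn’t get T5 and ended up accepting a high-end T4.)

Uber Senior: 410k
(Who knows when revenue will justify the 60B valuation? No chance of cashing out in the short term???)

Snapchat: 420k (Note: the vesting plan is actually 10%, 20%, 30%, 40%; this figure assumes 25% per year, which is a bit of a trap!)
(The vesting plan is a bit of a trap. What if I get fired in year three???)

Dropbox: 430k
(IPO in 2017?? http://www.bloomberg.com/news/articles/2016-08-15/dropbox-said-to-discuss-possible-2017-ipo-in-talks-with-advisers)

All of these have already been negotiated (except Snapchat, which does not bargain at all).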

Uber latest interview questions 2016

Despite heavy spending last year, Uber’s market share fell instead of growing, from 10.9% in Q1 to 8.7% in Q4… but anyway
1. Merge Two Sorted Lists
2. Sparse Matrix Multiplication
3. Given a Tetris-like board with n columns, blocks fall from above. A block is defined as follows:
class Block {
int left;
int right;
int height;
}
where 0 <= left < right < n. Blocks stack up as in Tetris. Find the maximum height.

class FallingBlock {
public FallingBlock(int width);
public void fallBlock(Block block);
public int getMaxHeight();
}
4. Design WhatsApp
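
One possible way to fill in the FallingBlock skeleton from question 3 is to keep the current height of every column. The sketch below assumes a block covers columns left..right inclusive, which the problem statement leaves open; the Block constructor is also an addition for convenience.

```java
// A straightforward O(n)-per-drop sketch for question 3.
// Assumption: a block covers columns left..right inclusive.
class Block {
    int left;
    int right;
    int height;

    Block(int left, int right, int height) {
        this.left = left;
        this.right = right;
        this.height = height;
    }
}

class FallingBlock {
    private final int[] columnHeights;  // current stack height per column
    private int maxHeight = 0;

    public FallingBlock(int width) {
        columnHeights = new int[width];
    }

    public void fallBlock(Block block) {
        // The block lands on the tallest column underneath it.
        int base = 0;
        for (int c = block.left; c <= block.right; c++) {
            base = Math.max(base, columnHeights[c]);
        }
        int top = base + block.height;
        // Every covered column now reaches the top of the new block.
        for (int c = block.left; c <= block.right; c++) {
            columnHeights[c] = top;
        }
        maxHeight = Math.max(maxHeight, top);
    }

    public int getMaxHeight() {
        return maxHeight;
    }
}
```

Each drop scans the covered columns twice, so it costs O(n) in the worst case. A likely follow-up is to replace the array with a segment tree supporting range-max query and range-assign, which would bring each drop down to O(log n).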

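Uber 2016 SWE2 offer package numbers: salary, bonus, refresh

Uber hasn’t been giving out Senior much lately, so let me talk about their SWE2 offers. I got an offer from U and negotiated it, and I have also helped friends with U offers negotiate. Through various channels I have learned of no fewer than 5-6 recent SWE2 offers.

The SWE2 range is fairly wide: roughly 140K+ base and 250K to 700K+ in equity. A high-end SWE2 offer is 140K + 700K/4, i.e. a little over 300K.

Otherwise they shamelessly count the top performer’s refresh as well. Ask anyone with an FLG offer, or an Airbnb or Pinterest offer: does any of those companies count refresh in the package?
The numbers as of the end of January were: Bonus $37K (on target), $111K (top 25%), $222K (top).
Reportedly the split was recently changed to 20% cash, 80% equity, but I don’t think the numbers changed much.

The point is that no company counts refresh as part of the package. Counting refresh is cheating, and counting it at the top-performer level is downright thuggish. Other companies’ refresh targets are roughly 1/x of the new-hire equity grant, with x around 4 to 5; compared with that, U’s 37K falls far short.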

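Riding a horse while looking for a BMW, a guide: if you’re already at FANG (never mind FLG) and want a million, the only move is to jump to a unicorn

If you’re going to jump, jump early; timing waits for no one.

My background: Google in the Seattle area, a bit over five years of experience, mainly backend. Over the last year or two, friends around me got the itch and switched jobs one after another. What I regret now is that I should have moved two years ago, right after I got my green card; dragging it out until now feels a bit late. On the other hand, I’ve been finding it harder and harder to get interested in my work at G. With prodding and encouragement from friends, last October I finally made up my mind to jump.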

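For experienced hires, grinding problems up to a point is enough; accumulated knowledge matters more

At the time I wasn’t thinking of moving to the Bay Area; I was pretty much set on the Seattle office of the ride-hailing company, UBER. I had actually been restless since last summer, but after a few problems I got lazy and let it drop. In October I started grinding lc seriously, though not quickly, and barely finished one pass by December. In hindsight I wasted a lot of time; picking 50~100 problems across the categories would have been about enough. I then spent a lot of time and energy reviewing systems knowledge. I reviewed the G infra I had used myself; I hadn’t used spanner, but it touched on my new project, so I skimmed it. Some things, like chubby and pubsub, I had used without understanding the internals at all, so I took the chance to read through their design docs and get a rough understanding. In the end I don’t think any of it was really that useful, but it’s good accumulated knowledge. Then there were the technologies used outside G, which I had never touched. I was quite worried at first, but after a round of study they didn’t seem so fancy; most have counterparts inside G and are simpler than what G built. All this took no less time than grinding problems, and design docs, tech reports and papers are nowhere near as fun to read as problems are to solve.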

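Enough rambling; let’s talk about the interviews. If you came for algorithm questions, you may be disappointed. I forget whether I signed an NDA, but since I met many Chinese interviewers, to be careful I’ll keep the specific questions vague. Honestly, telling you wouldn’t help anyway; what matters more is the communication.

A rejection doesn’t matter: it’s well worth practicing before interviewing at your top-choice company.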

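In December a friend urged me on, saying the ride-hailing company was about to raise another round and would pay less and less afterward, so I should hurry. I still felt underprepared and hesitated for a while, and only at the end of the month worked up the courage to have my friend submit my resume. I had also long been interested in Pinterest, so I had a friend refer me there as well, though frankly I just wanted to give it a try. Then, while I was at it, I got someone to refer me to Facebook for practice. I had heard that facebook tests a lot of coding problems, and I considered myself good at them. Plus, my background touches on social graph, infra and product, so before the interview I felt the offer was in the bag. It ended in rejection. How to put it: I wasn’t exactly sabotaged, but I didn’t perform well, and I realized some very silly mistakes on the way home. The algorithm questions were basically all from lc: one was hard, but a classic that everyone knows; the rest were medium, covering binary tree, stack, backtracking and prefix tree. The system question was to design a code search system: basically go through the motions of analysis and estimation, sketch the overall architecture, mostly going by experience and instinct, and then the interviewer asks questions and zooms in on certain components or specific scenarios. I’ve never worked on search, but I had some exposure to indexing systems, though I had forgotten a lot over time; back home I brushed up on it.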

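While waiting for the Facebook result I did two phone screens with U. Both interviewers were very nice Chinese engineers. I chatted in Chinese with the first and it was very pleasant; the coding problem wasn’t hard and could be solved with a queue, so he was probably going easy on me. The second round, system design, was also with a Chinese interviewer, who asked about google map. I hadn’t prepared geospatial topics and felt I stumbled through it, but the interviewer kindly let me pass. The P phone screen was again with a Chinese interviewer; I was lucky, got a medium lc problem, and passed easily. Interestingly, both companies have you write, compile and debug the code online. For someone like me who often makes typos or silly mistakes, debugging skill can make up for some carelessness: I fixed things quickly and got the code running, which probably left the interviewer with the impression that I can actually get work done. The downside is that if you can’t debug it within a minute or two, the pressure spikes instantly and you just have to tough it out.
Kudos: most Chinese interviewers really are very nice; meeting them is fate and luck!

Then the fb rejection arrived. My confidence took a hit and I felt low; something I had thought was a sure thing had fallen through. Looking back, the rejection was a good thing: it made me prepare for the later interviews humbly and as hard as I could.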

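Nerds need to be able to chat too: conversation in an interview is very important, I think no less so than solving problems and writing code.
The ride-hailing company’s onsite was in San Francisco, even though the position I applied for was in Seattle. The first round was with an American mgr, nominally behavioral, but really just a chat with no odd questions. I was well prepared and relaxed, talked up my project experience and my enthusiasm for the company’s bright prospects, and it was a pleasant conversation. The second round was a design question, one of their classics: design netflix. Again, requirements and feature analysis first, then the overall architecture; then the main question was how to serve media files with high availability and high throughput. I don’t really know this area, so I fell back on general knowledge and improvised: a bit about distributed file systems, a bit about backup requests, parallel reads and similar approaches, then a few words about the caching layer on top. Next came the recommendation system; time was short, so I could only briefly mention user-based/item-based CF. I had skimmed some material on netflix recommendations years ago, which was enough to get by. Overall the conversation was pleasant and we were mostly in sync. The next round still puzzles me a little: very open-ended problem solving, called design but not system design. Roughly, provide a fast track for theme park queues: for example, you pay 5 dollars and are told to come back in an hour, a bit like a scheduling system. At the end I was asked to write code to simulate it simply. I was so muddled that I forget how I answered and had no idea how it went, yet I passed; maybe in all my rambling I grazed the points the interviewer wanted to hear? The next round was coding, so simple it was baffling; I later heard from many others’ write-ups that easy coding questions are quite normal at U! But then came nonstop follow-ups: if this were production code, how would you write the unit tests? Do your own code review: what would you change, and how would you refactor? It felt like a test of real-world coding ability, which makes sense, since there’s little chance to write fancy algorithms at work. I think anyone with work experience who writes a lot of code, especially from a place with strict code review like g, should pass this kind of interview. The last person again mostly chatted; about halfway through, the chat led into a system design question, another classic high-frequency one: design a feature in their ride-hailing system. Handled it easily and left.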

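Overall, more than half the time felt like chatting, rambling and bragging. They also value culture fit highly: you need passion and ownership, you need to move fast and take risks. I think these come across gradually through conversation rather than through direct questions. That said, I think preparing seriously for culture fit is essential. Your technical level and background are basically fixed before the interview, whereas with careful preparation you can show your fit more fully. By the way, every interviewer asked why I wanted to join U; I could almost recite my answer. When the last person asked, I told him plainly that I had been asked many times already, said “I’ll try to answer this in a different way”, and improvised.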

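Going all in on Pinterest: targeted, thorough preparation may beat casting a wide net and interviewing in bulk

A week later was the Pinterest interview. I spent most of that week using their product and doing homework: product features, business model, what they do well, and what could be improved and how. Then there were Pinterest’s four cultural values: I thought carefully about how to bring them out in conversation through my own experience. For experienced people, I think all of this can be done well with effort. Although I prepared all this, in the end much of it never got used, but it at least gave me enough confidence to talk with the interviewers. Pinterest has a lot of Chinese employees and, even better, they stick together and are friendly; three of my rounds had Chinese interviewers. Since I’m going to join Pinterest, I won’t write up the interview questions in detail.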

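A worldly topic: money

The ride-hailing company gave an offer first, but the initial package was very low: the base was about the same as my current one, even slightly lower, with only $430K in stock, so the package basically matched my current compensation. Very disappointing. Only after I had the Pinterest offer did they catch up, and at 680K they wouldn’t go any higher.

Pinterest quickly gave a verbal offer, but for various reasons the numbers took a week to come out. The base is decent and higher than my current one, but considering state tax... The stock was eventually raised to 1 million. The whole process was very smooth and I was happy. I know there are stronger candidates who can get bigger packages, but I figure being satisfied myself is enough. Friends say their annual refreshers are also fairly generous, so the numbers on paper are quite attractive. Of course, if the company never goes public, it’s all waste paper. This was also my first real negotiation; I’m satisfied with the result and learned a lot from friends about asking for pay. If anyone is interested, message me privately, or I’ll write it up next time I’m free.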

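The finale

In the end I chose P. Being satisfied with the money was a relatively small factor; U’s offer was actually pretty good too. There were many other personal considerations that I won’t go into here. Comparing company prospects, U is obviously very attractive and an IPO is almost certain; P is riskier by comparison, but its potential is also good and the team is strong. I think Pinterest’s monetization is going well, and I feel fairly confident. I’ve had the chance to meet many Chinese employees at P, and found that there are many of them and they are friendly, united and get along well, which I like a lot.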

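FLGUA interview experience

Experience: phd+2yr. I recently interviewed with FLGUA and was lucky enough to get offers from FLGU. The packages were all about the same, G slightly higher, and in the end I went with G. As for U, which everyone cares about: the final offer was a 140k base and just under 10k RSUs. I tried my best to push it up but it just wouldn’t go higher. I envy those on this board who got 15k.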

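A: Basically all questions that have appeared in interview write-ups; their question bank seems small. There were only two coding rounds, but if your code doesn’t run to a result you definitely fail. They also care a lot about culture fit. The last two rounds were just chit-chat.
1. Project discussion
2. Design a machine learning system
3. Word ladder II
4. Alien dictionary
5&6. Culture fit chit-chat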

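U: Places heavy weight on design; coding expectations are moderate
1. Project discussion + design dropbox
2. Design uber eat
3. Design uber
4. Coding: (1) given an array, find the maximum sum that can be formed from non-adjacent elements; (2) given a binary tree, find the maximum sum that can be formed from non-adjacent nodes. Values can be positive or negative.
5. Hiring manager chit-chat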
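
Question 4 in the U list above can be approached with a take/skip dynamic program. The Java sketch below is one possible answer, not the one given in the interview. It assumes that choosing no element at all is allowed (so the result is never below 0) and that "non-adjacent" in the tree means no chosen node is the parent or child of another chosen node; all names are illustrative.

```java
// Take/skip dynamic programming for the "maximum sum of non-adjacent
// elements" question. Assumption: the empty selection (sum 0) is allowed.
class NonAdjacentSum {
    // Array version: O(n) time, O(1) extra space.
    static long maxSumArray(int[] a) {
        long take = 0;  // best sum where the previous element is chosen
        long skip = 0;  // best sum where the previous element is not chosen
        for (int x : a) {
            long newTake = skip + x;       // choosing x requires skipping its neighbor
            skip = Math.max(skip, take);   // not choosing x keeps the better of the two
            take = newTake;
        }
        return Math.max(take, skip);
    }

    static class Node {
        final int val;
        final Node left, right;
        Node(int val, Node left, Node right) { this.val = val; this.left = left; this.right = right; }
    }

    // Tree version: post-order traversal returning {bestWithNode, bestWithoutNode}.
    private static long[] solve(Node n) {
        if (n == null) return new long[] {0, 0};
        long[] l = solve(n.left);
        long[] r = solve(n.right);
        long with = n.val + l[1] + r[1];   // children must be skipped
        long without = Math.max(l[0], l[1]) + Math.max(r[0], r[1]);
        return new long[] {with, without};
    }

    static long maxSumTree(Node root) {
        long[] s = solve(root);
        return Math.max(s[0], s[1]);
    }
}
```

Negative values need no special handling here: a negative element simply never improves the "take" state over the "skip" state. If the interviewer requires at least one element to be chosen, the all-negative case would need separate treatment.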

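L:
1. Project discussion. Design tiny url
2. Roman to integer & integer to roman. You need to handle invalid input; for example, IIII is invalid and IV is the correct form
3. Machine learning system design: given a set of job postings, how do you extract the job title and required skills?
4. Find 1-3 hop connections on linkedin. System design and algorithm implementation
5. Some statistics concepts: how to evaluate the results of an A/B experiment, and how to estimate the p-value and confidence interval
6. Designed a people you may know feature. There was one more design question that I’ve forgotten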
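
For question 2 in the L list above, one simple way to reject inputs such as IIII is to parse greedily and then check that converting the result back gives exactly the same string. The Java sketch below takes that round-trip approach; it assumes the usual 1 to 3999 range, and the class and method names are illustrative.

```java
// Roman numeral conversion with validation by round trip: an input is valid
// only if it is the canonical spelling of the number it parses to.
// Assumption: supported range is 1..3999.
class Roman {
    private static final int[] VALUES =
        {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
        {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};

    static String toRoman(int n) {
        if (n < 1 || n > 3999) throw new IllegalArgumentException("out of range: " + n);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < VALUES.length; i++) {
            while (n >= VALUES[i]) {
                sb.append(SYMBOLS[i]);
                n -= VALUES[i];
            }
        }
        return sb.toString();
    }

    static int toInt(String s) {
        int total = 0;
        int pos = 0;
        // Greedy parse, largest symbols first.
        for (int i = 0; i < VALUES.length; i++) {
            while (s.startsWith(SYMBOLS[i], pos)) {
                total += VALUES[i];
                pos += SYMBOLS[i].length();
            }
        }
        // Reject leftovers, out-of-range values and non-canonical spellings.
        if (pos != s.length() || total < 1 || total > 3999 || !toRoman(total).equals(s)) {
            throw new IllegalArgumentException("invalid Roman numeral: " + s);
        }
        return total;
    }
}
```

With this check, IIII parses to 4 but is rejected because the canonical form of 4 is IV, which matches the example in the question.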

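G: I signed, so no details. It was coding throughout: no project questions, no resume discussion and no design questions. Overall harder than the other companies.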

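F: Basically all questions from interview write-ups, no surprises
1. Project discussion, then sort color
2. Design a friend recommendation system for facebook
3. Coding: given a set of tasks represented by letters and the minimum time interval K between identical tasks, find the shortest time needed to finish all tasks. For example, if tasks is AAA and K=2, the shortest time is 5 (A_A_A); if tasks is AABBCC and K=3, the shortest time is 6 (ABCABC)
4. Another recommendation-related design question; I forget the details. Then wrote clone graph
5. Design a facebook feature: when a new comment appears under a post, it is displayed automatically, without needing a refresh.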
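
Question 3 in the F list above has a well-known counting solution. Going by the two examples, K is read here as the minimum distance between the time slots of two identical tasks (A_A_A for K=2), which is an interpretation rather than something the post spells out. The sketch also assumes tasks are uppercase letters A to Z and K >= 1; the names are illustrative.

```java
// Counting solution for the task-scheduling question.
// Assumptions: tasks are 'A'..'Z', k >= 1, and k is the minimum distance
// between the time slots of two identical tasks.
class TaskScheduler {
    static int leastTime(String tasks, int k) {
        if (tasks.isEmpty()) return 0;
        int[] count = new int[26];
        int maxCount = 0;
        for (char c : tasks.toCharArray()) {
            maxCount = Math.max(maxCount, ++count[c - 'A']);
        }
        // Number of distinct tasks that occur maxCount times.
        int tasksWithMax = 0;
        for (int n : count) {
            if (n == maxCount) tasksWithMax++;
        }
        // The most frequent task forces (maxCount - 1) frames of length k,
        // followed by one final slot for each task tied for the maximum.
        int framed = (maxCount - 1) * k + tasksWithMax;
        // With enough distinct tasks there is no idle time at all.
        return Math.max(framed, tasks.length());
    }
}
```

For AAA with K=2 this gives (3 - 1) * 2 + 1 = 5, and for AABBCC with K=3 it gives (2 - 1) * 3 + 3 = 6, matching the two examples in the question.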