Xiangpeng Hao

(he/him, pronunciation: Shyang-pung How)

Last update: October 2024.
I'm a fourth-year PhD student at the University of Wisconsin-Madison studying computer science with a focus on database/storage systems.
My PhD advisor is Remzi H. Arpaci-Dusseau. My PhD is funded (2024-2025) by to work on Apache DataFusion/Arrow/Parquet.

I have worked on multiple database engines and key-value stores -- Arrow, DataFusion, Parquet, BigTable, Spanner, FASTER, Garnet, Bf-Tree, Two-trees, Congee.

My research solves today's problems and directly connects to users. I believe real impact stems from first-person experience of real problems. Systems research that is not grounded in real systems is a waste of intelligence and time.

People I have worked closely with:
Tianzheng Wang -- my undergraduate advisor. He is a rock-star database researcher, and I learned a lot from him. He introduced me to database research, and I still love it.
Xiangyao Yu -- my PhD advisor for the first three and a half years.
Yixin Luo -- my intern mentor @Google. We did great work on database auto-tuning.
Badrish Chandramouli -- my intern mentor @MSR. He is a great researcher and mentor. He reasons carefully about what I say, and his attention to detail is incredible.
Andrew Lamb -- my intern mentor @InfluxData. His passion and professionalism in DataFusion development have reshaped my research to connect more closely to real-world applications.
Design axioms (ordered)
  1. People-centric. I build systems for people to {use | build upon | contribute to}, not just for academic records.
  2. Correctness. I code in Rust, fuzz test all the core components, and run systematic concurrency tests on all multi-threaded code.
  3. Performance, from keyboard to screen.
I'm an existentialist.
  • Existence Precedes Essence. We exist for ourselves as self-making or self-defining beings, and we are always in the process of making or defining ourselves through the situated choices we make as our lives unfold.
  • Freedom. It is true that we are free to create ourselves, but it is also true that we are already created by our situation.
  • Nihilism. (1) individualism and loneliness (2) life is regulated and controlled by faceless bureaucrats.
Leap of faith: Be nice; Be good to society.

Good reads:

  1. Returning to Reims
  2. The Idiot
It’s the disease of thinking that a really great idea is 90% of the work. And if you just tell all these other people “here’s this great idea,” then of course they can go off and make it happen. And the problem with that is that there’s just a tremendous amount of craftsmanship in between a great idea and a great product. And as you evolve that great idea, it changes and grows. It never comes out like it starts because you learn a lot more as you get into the subtleties of it. And you also find there are tremendous tradeoffs that you have to make... Designing a product is keeping five thousand things in your brain and fitting them all together in new and different ways to get what you want. -- Steve Jobs
While building systems, I publish papers.

Bf-Tree: A Modern Read-Write-Optimized Concurrent Larger-Than-Memory Range Index.
Xiangpeng Hao, Badrish Chandramouli. (VLDB 2024) [more]

Shadow Filesystems: Recovering from Filesystem Runtime Errors via Robust Alternative Execution.
Jing Liu, Xiangpeng Hao, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau, Tej Chajed. (HotStorage '24)

Towards Buffer Management with Tiered Main Memory.
Xiangpeng Hao, Xinjing Zhou, Xiangyao Yu, Michael Stonebraker. (SIGMOD 2024)

Blink-hash: An Adaptive Hybrid Index for In-Memory Time-Series Databases.
Hokeun Cha, Xiangpeng Hao, Tianzheng Wang, Huanchen Zhang, Aditya Akella, Xiangyao Yu. Proceedings of the VLDB Endowment (VLDB 2023)

Towards Accelerating Data Intensive Application's Shuffle Process Using SmartNICs.
Jiaxin Lin, Tao Ji, Xiangpeng Hao, Hokeun Cha, Yanfang Le, Xiangyao Yu, Aditya Akella. Proceedings of the ACM on Measurement and Analysis of Computing Systems

PiBench Online: Interactive Benchmarking of Persistent Memory Indexes (Demo).
Xiangpeng Hao, Lucas Lersch, Tianzheng Wang, Ismail Oukid. 46th International Conference on Very Large Data Bases (VLDB 2020)

DASH: Dynamic and Scalable Hashing on Persistent Memory.
Baotong Lu, Xiangpeng Hao, Tianzheng Wang, Eric Lo. 46th International Conference on Very Large Data Bases (VLDB 2020)

Evaluating Persistent Memory based Range Indexes.
Lucas Lersch, Xiangpeng Hao, Ismail Oukid, Tianzheng Wang, Thomas Willhalm. 46th International Conference on Very Large Data Bases (VLDB 2020)

Evaluating Colour Constancy on the new MIST dataset of Multi-Illuminant Scenes.
Xiangpeng Hao, Brian Funt, Hanxiao Jiang. 27th Color and Imaging Conference

A Multi-illuminant Synthetic Image Test Set.
Xiangpeng Hao, Brian Funt. Color Research and Application