BUsy Street Scenes (BUSS) Dataset

BUsy Street Scenes (BUSS) is a challenging dataset of video sequences recorded with a handheld mobile phone (the rear camera of an OPPO A5 2020 smartphone) in crowded city streets, together with synchronized inertial measurement unit (IMU) data. The goal of the dataset is to evaluate the robustness of camera rotation estimation algorithms in dense, dynamic scenes with many moving objects and complex camera motion. The dataset comprises 17 video sequences of about 10 seconds each, recorded at 30 FPS in full HD (1920x1080) RGB. We used the Android Open-Camera Sensor app to synchronously record video and angular rate from the phone’s MEMS gyroscopes (at 400 Hz), and then generated the rotation ground truth from the angular rate.

Ground Truth

The ground truth rotation at frame f_t represents the forward rotation from video frame f_t to the immediately following frame f_{t+1}. To obtain the rotation between two frames, we numerically integrate the angular rate measurements. The ground truth rotations are represented as quaternions.
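The integration step can be sketched as follows. This is a minimal illustration, not the exact procedure used to build the dataset: it assumes quaternions in (w, x, y, z) order, angular rates in rad/s expressed in the sensor frame, a zero-order hold between gyroscope samples, and hypothetical array names (`gyro_t`, `gyro_w`). Any gyroscope time before the first sample in the interval is ignored.

```python
import numpy as np

def quat_multiply(q, r):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w0, x0, y0, z0 = q
    w1, x1, y1, z1 = r
    return np.array([
        w0 * w1 - x0 * x1 - y0 * y1 - z0 * z1,
        w0 * x1 + x0 * w1 + y0 * z1 - z0 * y1,
        w0 * y1 - x0 * z1 + y0 * w1 + z0 * x1,
        w0 * z1 + x0 * y1 - y0 * x1 + z0 * w1,
    ])

def delta_quat(omega, dt):
    """Rotation produced by a constant angular rate omega (rad/s) held for dt seconds."""
    omega = np.asarray(omega, dtype=float)
    speed = np.linalg.norm(omega)
    angle = speed * dt
    if angle < 1e-12:
        # Negligible rotation: return the identity quaternion.
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = omega / speed
    return np.concatenate(([np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis))

def integrate_gyro(gyro_t, gyro_w, t_start, t_end):
    """Compose the gyro samples with timestamps in [t_start, t_end) into one rotation.

    gyro_t: (N,) sample timestamps in seconds (hypothetical layout).
    gyro_w: (N, 3) angular rates in rad/s (hypothetical layout).
    """
    gyro_t = np.asarray(gyro_t, dtype=float)
    gyro_w = np.asarray(gyro_w, dtype=float)
    mask = (gyro_t >= t_start) & (gyro_t < t_end)
    times, rates = gyro_t[mask], gyro_w[mask]

    q = np.array([1.0, 0.0, 0.0, 0.0])
    for i in range(len(times)):
        # Each sample is held until the next sample, or until the end of the interval.
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        # Body-frame rates compose by right-multiplication (an assumed convention).
        q = quat_multiply(q, delta_quat(rates[i], t_next - times[i]))
    return q / np.linalg.norm(q)
```

With 400 Hz gyroscope data and 30 FPS video, each frame interval covers roughly 13 gyroscope samples, so calling `integrate_gyro` with the timestamps of f_t and f_{t+1} composes that many small rotations.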

Camera

Our videos are recorded with the OPPO's rear camera in full HD (1920x1080) RGB at 30 FPS.

Camera Settings

Resolution 12.0 MP (1920x1080)
Aperture f/1.8
ISO 90-1600
Shutter type Rolling shutter

Intrinsic Parameters

We used an 8x6 checkerboard (25 mm squares) and the MATLAB calibration tool to calibrate the intrinsic parameters of the OPPO rear camera:

Focal length (pixels)

Focal length in x and y, represented as a two-element vector [fx fy] in pixels.

[1.5980e+03, 1.5764e+03]

Principal point (pixels)

Coordinates of the optical center of the camera, represented as a two-element vector [cx cy] in pixels.

[969.1056, 514.1442]

Image size

Image size produced by the camera, represented as a two-element vector, [mrows ncols].

[1080, 1920]

Radial distortion

Radial distortion coefficients of the lens, represented as a three-element vector [k1 k2 k3].

[0.2679, -1.1773, 1.6158]

Tangential distortion

Tangential distortion coefficients of the lens, represented as a two-element vector [p1 p2].

[-0.0084, 0.0019]

Skew

Skew of the camera axes, a scalar value. The skew is 0 when the x and y axes are perpendicular.

1.0089

For a more detailed description of the camera intrinsic properties, please refer to this MATLAB documentation
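As an illustration of how these values fit together, the sketch below assembles them into the usual 3x3 pinhole intrinsic matrix and maps between pixels and viewing rays. It assumes the standard layout K = [[fx, s, cx], [0, fy, cy], [0, 0, 1]] and ignores lens distortion. The helper names `project` and `backproject` are hypothetical. MATLAB reports pixel coordinates with 1-based indexing, so libraries that index pixels from 0 may need a one-pixel shift of the principal point, which is not applied here.

```python
import numpy as np

# Calibrated values listed above.
fx, fy = 1.5980e+03, 1.5764e+03
cx, cy = 969.1056, 514.1442
skew = 1.0089

# Pinhole intrinsic matrix (distortion coefficients are not used in this sketch).
K = np.array([
    [fx, skew, cx],
    [0.0, fy, cy],
    [0.0, 0.0, 1.0],
])

def project(point_cam):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    p = K @ np.asarray(point_cam, dtype=float)
    return p[:2] / p[2]

def backproject(pixel):
    """Return the unit viewing ray, in camera coordinates, through a pixel."""
    homogeneous = np.array([pixel[0], pixel[1], 1.0])
    ray = np.linalg.solve(K, homogeneous)
    return ray / np.linalg.norm(ray)
```

A point on the optical axis, such as (0, 0, 1), projects to the principal point, which is a quick sanity check on the matrix layout.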

Inertial Measurement Unit (IMU)

The sensor's coordinate system is defined relative to the phone's screen and follows the right-hand convention. The coordinate system is fixed and does not change with screen display orientation. Only the gyroscope was utilized.

Gyroscope settings
Model name lsm6ds3c
Average update rate 400Hz

Validating Ground Truth

Our experiments show strong agreement between the rotational velocities measured by two different gyroscopes. It is highly unlikely that the two sensors would agree if their measurements were incorrect. This strongly suggests that the gyroscope measurements are a good ground truth for frame-to-frame rotation estimation. For more details, please refer to the supplemental material.
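One simple way to quantify how closely two frame-to-frame rotations agree, whether they come from two sensors or from an estimator and the ground truth, is the angle of the relative rotation between them. The sketch below is an illustration only and is not the validation procedure described in the supplemental material. It assumes both quaternions use the same component order, and the function name is hypothetical.

```python
import numpy as np

def rotation_angle_between(q_a, q_b):
    """Angle in radians of the relative rotation between two quaternions."""
    q_a = np.asarray(q_a, dtype=float)
    q_b = np.asarray(q_b, dtype=float)
    q_a = q_a / np.linalg.norm(q_a)
    q_b = q_b / np.linalg.norm(q_b)
    # q and -q encode the same rotation, so take the absolute value of the dot product.
    d = abs(np.dot(q_a, q_b))
    # Clip to guard against rounding slightly above 1 before arccos.
    return 2.0 * np.arccos(np.clip(d, 0.0, 1.0))
```

The result lies between 0 and pi, and it is 0 when the two quaternions describe the same rotation.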

Personally Identifiable Information (PII)

To meet strict privacy standards, videos are captured only in public places, and faces and other personally identifiable information (PII) are blurred.

BibTeX

When using this dataset in your research, please cite:

@article{Delattre2023RobustRotation,
  author    = {Fabien Delattre and David Dirnfeld and Phat Nguyen and Stephen Scarano and Michael J. Jones and Pedro Miraldo and Erik Learned-Miller},
  title     = {Robust frame-to-frame camera rotation estimation in crowded scenes},
  journal   = {2023 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year      = {2023}
}