Regularized optimization, which minimizes the sum of a smooth function and a nonsmooth function, arises widely in applications including signal processing, engineering, and data analysis, where the smooth term accounts for data fitting and the nonsmooth term promotes desirable properties in the solution. Optimization methods that exploit the smoothness of the data-fitting term are usually preferred for this class of problems because of their close relation to methods for the unregularized, smooth counterparts. In this dissertation, we study efficient methods for large-scale regularized optimization problems, covering second- and first-order methods as well as incremental methods that update one block of variables at a time. We provide theory for convergence speed, novel algorithms with efficient practical performance, and analyses of existing methods that are sharper than previously known results. We also discuss the application of these methods in distributed settings, in which multiple machines are used to cope with extremely large-scale data sets.
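To make the problem template concrete, the following is a minimal sketch of proximal gradient descent (ISTA) applied to l1-regularized least squares, one common instance of a smooth data-fitting term plus a nonsmooth regularizer; it is not any specific algorithm from this dissertation, and all function names here are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1, applied elementwise.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient(A, b, lam, step, iters=2000):
    # Minimize 0.5 * ||A x - b||^2 (smooth, data fitting)
    #        + lam * ||x||_1      (nonsmooth, promotes sparsity).
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)            # gradient of the smooth term
        x = soft_threshold(x - step * grad, # gradient step, then
                           step * lam)      # proximal step on the l1 term
    return x

# A small usage example with a sparse ground truth.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1 / Lipschitz constant of the gradient
x_hat = proximal_gradient(A, b, lam=0.05, step=step)
```

The gradient step uses only the smooth term, which is what lets such methods inherit machinery from unregularized smooth optimization; the nonsmooth term enters only through its (often cheap, closed-form) proximal operator.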